Project 4: Trading Strategy Backtester

Build a backtesting engine that replays historical market data, simulates order execution, and calculates profit and loss (P&L). A backtester is the tool quants rely on to validate strategy ideas before risking real capital.


Quick Reference

Attribute        Value
---------------  ----------------------------------------------------------------
Difficulty       Intermediate
Time Estimate    2-3 weeks
Languages        Rust (Primary), C++, Python, Julia (Alternatives)
Prerequisites    Basic data structures, file I/O, understanding of trading concepts
Key Topics       Event-driven architecture, market simulation, performance metrics, time-series data
Coolness Level   Level 4: Quant Essential
Portfolio Value  Interview Gold (especially for quant roles)

1. Learning Objectives

By completing this project, you will:

  1. Master event-driven architecture: Design a system where market events drive strategy execution in correct temporal order
  2. Understand market microstructure: Learn how orders get filled, what slippage means, and why backtests lie
  3. Implement realistic fill simulation: Model partial fills, slippage, and latency to create more accurate simulations
  4. Calculate quantitative performance metrics: Implement Sharpe ratio, maximum drawdown, Sortino ratio, and other key statistics
  5. Handle large datasets efficiently: Process gigabytes of tick data without running out of memory
  6. Build the bridge to live trading: Understand how backtesting relates to paper trading and production systems
  7. Develop critical thinking about strategy validation: Recognize overfitting, lookahead bias, and survivorship bias

2. Theoretical Foundation

2.1 Core Concepts

What is Backtesting?

Backtesting is the process of testing a trading strategy on historical data to estimate how it would have performed in the past. It’s the scientific method applied to trading: form a hypothesis (strategy), test it against data (backtest), and evaluate results (performance metrics).

                    THE BACKTESTING LOOP
+------------------------------------------------------------------------+
|                                                                        |
|   IDEA                                                                 |
|     |                                                                  |
|     v                                                                  |
|   +------------------+                                                 |
|   | Strategy Logic   |                                                 |
|   | (Entry/Exit)     |                                                 |
|   +--------+---------+                                                 |
|            |                                                           |
|            v                                                           |
|   +------------------+     +------------------+                        |
|   | Historical Data  | --> | Backtest Engine  |                        |
|   | (Ticks/Bars)     |     | (Event Replay)   |                        |
|   +------------------+     +--------+---------+                        |
|                                     |                                  |
|                                     v                                  |
|                            +------------------+                        |
|                            | Fill Simulator   |                        |
|                            | (Realistic Exec) |                        |
|                            +--------+---------+                        |
|                                     |                                  |
|                                     v                                  |
|                            +------------------+                        |
|                            | Performance      |                        |
|                            | Calculator       |                        |
|                            +--------+---------+                        |
|                                     |                                  |
|                                     v                                  |
|                            +------------------+                        |
|                            | Results Analysis |                        |
|                            +------------------+                        |
|                                     |                                  |
|                +--------------------+--------------------+             |
|                |                    |                    |             |
|                v                    v                    v             |
|          Sharpe: 1.8          Max DD: -12%         Win Rate: 55%      |
|                                                                        |
|   If good: Proceed to paper trading                                    |
|   If bad: Refine strategy or discard                                   |
|                                                                        |
+------------------------------------------------------------------------+

Why Backtest?

  1. Risk-free validation: Test ideas without losing money
  2. Quick iteration: Replay years of market data in minutes
  3. Quantitative evaluation: Objective performance metrics
  4. Parameter optimization: Find optimal settings (with caution)
  5. Reality check: Many “obvious” strategies fail on historical data

Event-Driven Architecture

The backtester processes events in strict temporal order. Each event triggers strategy callbacks, which may generate orders. Orders are then filled according to the simulated market.

                    EVENT-DRIVEN DATA FLOW
+------------------------------------------------------------------------+
|                                                                        |
|  Historical Data                                                       |
|       |                                                                |
|       v                                                                |
|  +----------+   +----------+   +----------+   +----------+             |
|  | Event 1  |-->| Event 2  |-->| Event 3  |-->| Event N  |             |
|  | 09:30:01 |   | 09:30:02 |   | 09:30:03 |   | 15:59:59 |             |
|  | AAPL BID |   | AAPL ASK |   | MSFT BID |   | AAPL TRD |             |
|  +----+-----+   +----+-----+   +----+-----+   +----+-----+             |
|       |              |              |              |                   |
|       v              v              v              v                   |
|  +----------------------------------------------------------+         |
|  |                   EVENT DISPATCHER                        |         |
|  |                                                           |         |
|  |   - Maintains time-ordered event queue                    |         |
|  |   - Dispatches to registered callbacks                    |         |
|  |   - Ensures no future data leaks (lookahead prevention)   |         |
|  +----------------------------------------------------------+         |
|       |              |              |              |                   |
|       v              v              v              v                   |
|  +----------------------------------------------------------+         |
|  |                     STRATEGY                              |         |
|  |                                                           |         |
|  |   on_tick(symbol, bid, ask, timestamp)                    |         |
|  |   on_bar(symbol, ohlcv, timestamp)                        |         |
|  |   on_fill(order_id, fill_price, quantity)                 |         |
|  |                                                           |         |
|  |   Strategy sees ONLY data up to current event timestamp   |         |
|  +----+-----------------------------------------------------+         |
|       |                                                                |
|       v                                                                |
|  +----------------------------------------------------------+         |
|  |                  ORDER GENERATOR                          |         |
|  |                                                           |         |
|  |   Strategy emits:                                         |         |
|  |   - BUY(symbol, quantity, price_type, limit_price?)       |         |
|  |   - SELL(symbol, quantity, price_type, limit_price?)      |         |
|  |   - CANCEL(order_id)                                      |         |
|  +----+-----------------------------------------------------+         |
|       |                                                                |
|       v                                                                |
|  +----------------------------------------------------------+         |
|  |                  FILL SIMULATOR                           |         |
|  |                                                           |         |
|  |   - Models order execution                                |         |
|  |   - Applies slippage                                      |         |
|  |   - Handles partial fills                                 |         |
|  |   - Simulates latency                                     |         |
|  +----------------------------------------------------------+         |
|                                                                        |
+------------------------------------------------------------------------+

Key Properties of Event-Driven Design:

  1. Temporal correctness: Events processed in exact time order
  2. No lookahead bias: Strategy cannot see future events
  3. Realistic simulation: Orders interact with simulated market
  4. Strategy isolation: Strategy logic separated from infrastructure
  5. Extensibility: Easy to add new event types or strategies
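
The dispatcher's ordering guarantee can be sketched in Rust with a min-heap keyed on timestamp. This is a minimal illustration rather than a prescribed design: the `Event`, `Dispatcher`, `Strategy`, and `Recorder` names, the nanosecond timestamps, and the integer-cent prices are all assumptions made for the example. A sequence number breaks timestamp ties so same-instant events keep their arrival order, and the strategy is handed one event at a time, so it has no path to later data.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Event kinds loosely mirroring the diagram (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EventKind {
    Bid,
    Ask,
    Trade,
}

/// One timestamped event. Field order matters: the derived `Ord` compares
/// `timestamp_ns` first, then `seq`, so ties keep their insertion order.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Event {
    pub timestamp_ns: u64,
    pub seq: u64,
    pub symbol: String,
    pub kind: EventKind,
    pub price_cents: i64,
}

/// Hypothetical strategy interface; a fuller version would add
/// on_tick / on_bar / on_fill callbacks as in the diagram.
pub trait Strategy {
    fn on_event(&mut self, event: &Event);
}

/// Time-ordered event queue. `Reverse` turns the max-heap into a min-heap.
#[derive(Default)]
pub struct Dispatcher {
    queue: BinaryHeap<Reverse<Event>>,
    next_seq: u64,
}

impl Dispatcher {
    pub fn push(&mut self, timestamp_ns: u64, symbol: &str, kind: EventKind, price_cents: i64) {
        let event = Event {
            timestamp_ns,
            seq: self.next_seq,
            symbol: symbol.to_string(),
            kind,
            price_cents,
        };
        self.next_seq += 1;
        self.queue.push(Reverse(event));
    }

    /// Pops events oldest-first and hands each one to the strategy.
    /// Returns the number of events dispatched.
    pub fn run<S: Strategy>(&mut self, strategy: &mut S) -> usize {
        let mut dispatched = 0;
        while let Some(Reverse(event)) = self.queue.pop() {
            strategy.on_event(&event);
            dispatched += 1;
        }
        dispatched
    }
}

/// Example strategy that only records the timestamps it was shown.
#[derive(Default)]
pub struct Recorder {
    pub seen: Vec<u64>,
}

impl Strategy for Recorder {
    fn on_event(&mut self, event: &Event) {
        self.seen.push(event.timestamp_ns);
    }
}
```

Pushing events with timestamps 3, 1, 2, 2 and then calling `run` delivers them in the order 1, 2, 2, 3. Holding every event in a heap is fine for small tests; for gigabyte-scale tick data an engine would more plausibly merge pre-sorted streams instead of loading everything up front.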

Market Microstructure

Understanding how orders actually get filled is crucial for realistic backtesting.

                    ORDER EXECUTION IN THE REAL WORLD
+------------------------------------------------------------------------+
|                                                                        |
|   Your Order: BUY 1000 shares @ Market                                 |
|                                                                        |
|   Current Order Book (AAPL):                                           |
|   +------------------+------------------+                              |
|   |      BIDS        |      ASKS        |                              |
|   +------------------+------------------+                              |
|   | 150.24 x 500     | 150.26 x 300     | <-- Best Ask (you buy here) |
|   | 150.23 x 800     | 150.27 x 700     |                              |
|   | 150.22 x 1200    | 150.28 x 1000    |                              |
|   | 150.21 x 600     | 150.29 x 400     |                              |
|   +------------------+------------------+                              |
|                                                                        |
|   What Actually Happens:                                               |
|                                                                        |
|   1. First 300 shares filled @ 150.26   (cleared the best ask)         |
|   2. Next 700 shares filled @ 150.27    (moved up to next level)       |
|   3. Remaining 0 shares (filled completely)                            |
|                                                                        |
|   Execution Report:                                                    |
|   +------------------+------------------+------------------+           |
|   | Quantity         | Price            | Value            |           |
|   +------------------+------------------+------------------+           |
|   | 300              | $150.26          | $45,078          |           |
|   | 700              | $150.27          | $105,189         |           |
|   +------------------+------------------+------------------+           |
|   | Total: 1000      | VWAP: $150.267   | $150,267         |           |
|   +------------------+------------------+------------------+           |
|                                                                        |
|   vs. Naive Backtest (best price assumption):                          |
|   | 1000             | $150.26          | $150,260         |           |
|   +------------------+------------------+------------------+           |
|                                                                        |
|   Slippage: $150,267 - $150,260 = $7 (or 0.7 cents/share)              |
|   This recurs whenever an order exceeds top-of-book depth.             |
|                                                                        |
+------------------------------------------------------------------------+
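
The book walk in the diagram can be reproduced with a short Rust sketch. The `Level` and `SweepResult` types and the `sweep_asks` function are hypothetical names, and prices are held as integer cents (an assumption of this example) so the arithmetic is exact.

```rust
/// One ask-side price level: price in cents and shares available.
#[derive(Debug, Clone, Copy)]
pub struct Level {
    pub price_cents: i64,
    pub size: u64,
}

/// Outcome of sweeping the ask side with a market buy.
#[derive(Debug, PartialEq)]
pub struct SweepResult {
    pub filled: u64,
    pub cost_cents: i64,
    pub unfilled: u64,
}

/// Walks ask levels from best to worst, consuming liquidity until the order
/// is filled or the book runs out. `asks` must be sorted by ascending price.
pub fn sweep_asks(asks: &[Level], quantity: u64) -> SweepResult {
    let mut remaining = quantity;
    let mut cost_cents = 0i64;
    for level in asks {
        if remaining == 0 {
            break;
        }
        let take = remaining.min(level.size);
        cost_cents += take as i64 * level.price_cents;
        remaining -= take;
    }
    SweepResult {
        filled: quantity - remaining,
        cost_cents,
        unfilled: remaining,
    }
}

/// Volume-weighted average fill price in dollars, or None if nothing filled.
pub fn vwap_dollars(result: &SweepResult) -> Option<f64> {
    if result.filled == 0 {
        None
    } else {
        Some(result.cost_cents as f64 / 100.0 / result.filled as f64)
    }
}
```

With the four ask levels shown above, a 1000-share buy costs 15,026,700 cents ($150,267), which is 700 cents ($7) more than 1000 shares at the best ask, matching the execution report. A 3000-share buy would exhaust the displayed book at 2400 shares and leave 600 unfilled.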

Slippage is the difference between the expected price and the actual execution price. It has several causes:

  1. Market impact: Your order consumes liquidity at multiple price levels
  2. Latency: By the time your order arrives, the market has moved
  3. Spread crossing: Market orders pay the spread
  4. Queue position: Your limit order might not be first at that price

                    SLIPPAGE MODELING
+------------------------------------------------------------------------+
|                                                                        |
|   Components of Slippage:                                              |
|                                                                        |
|   1. SPREAD COST (always present for market orders)                    |
|      +-------------------------------------------+                     |
|      |                                           |                     |
|      |  Bid: 150.24          Ask: 150.26         |                     |
|      |           ^            ^                  |                     |
|      |           |  Spread    |                  |                     |
|      |           |  $0.02     |                  |                     |
|      |           |____________|                  |                     |
|      |                                           |                     |
|      |  Midpoint: 150.25                         |                     |
|      |  Buy at ask: +$0.01 per share             |                     |
|      |  Sell at bid: -$0.01 per share            |                     |
|      |                                           |                     |
|      +-------------------------------------------+                     |
|                                                                        |
|   2. MARKET IMPACT (larger orders, illiquid stocks)                    |
|      +-------------------------------------------+                     |
|      |                                           |                     |
|      |  Impact = k * sqrt(order_size / ADV)      |                     |
|      |                                           |                     |
|      |  k     = market-specific constant         |                     |
|      |  ADV   = Average Daily Volume             |                     |
|      |                                           |                     |
|      |  Example: 10,000 share order              |                     |
|      |           ADV = 1,000,000 shares          |                     |
|      |           k = 0.1                         |                     |
|      |           Impact = 0.1 * sqrt(0.01)       |                     |
|      |                  = 0.1 * 0.1 = 0.01 (1%)  |                     |
|      |                                           |                     |
|      +-------------------------------------------+                     |
|                                                                        |
|   3. LATENCY (order reaches exchange after price moves)                |
|      +-------------------------------------------+                     |
|      |                                           |                     |
|      |  Time T0: You decide to buy @ 150.25      |                     |
|      |  Time T1: Order reaches exchange          |                     |
|      |  Time T2: Order executed @ 150.28         |                     |
|      |                                           |                     |
|      |  Latency impact: $0.03/share              |                     |
|      |                                           |                     |
|      |  In HFT: latency is microseconds          |                     |
|      |  Retail: latency can be seconds           |                     |
|      |                                           |                     |
|      +-------------------------------------------+                     |
|                                                                        |
+------------------------------------------------------------------------+
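
The three components above can be combined into a rough price estimate. The Rust sketch below is one possible reading, not a calibrated model: the function names are hypothetical, the impact term is treated as a fraction of the midpoint, and adding the components linearly is a simplifying assumption. The constant `k` would need to be fitted per market.

```rust
/// Half the quoted spread: cost per share of trading at the touch
/// instead of at the midpoint.
pub fn half_spread(bid: f64, ask: f64) -> f64 {
    (ask - bid) / 2.0
}

/// Square-root market impact as a fraction of price:
/// k * sqrt(order_size / ADV).
pub fn sqrt_impact(k: f64, order_size: f64, adv: f64) -> f64 {
    assert!(adv > 0.0, "ADV must be positive");
    k * (order_size / adv).sqrt()
}

/// Estimated price for a market buy: midpoint + half spread
/// + impact (as a fraction of the midpoint) + latency drift in dollars.
/// The additive combination is an assumption of this sketch.
pub fn estimated_buy_price(
    bid: f64,
    ask: f64,
    k: f64,
    order_size: f64,
    adv: f64,
    latency_drift: f64,
) -> f64 {
    let mid = (bid + ask) / 2.0;
    mid + half_spread(bid, ask) + mid * sqrt_impact(k, order_size, adv) + latency_drift
}
```

Using the numbers from the diagram (bid 150.24, ask 150.26, k = 0.1, a 10,000-share order against an ADV of 1,000,000, and $0.03 of latency drift), the half spread is $0.01, the impact fraction is 0.01, and the estimated buy price is about $151.79.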

Fill Simulation Logic

The fill simulator determines when and at what price orders execute:

                    FILL SIMULATION DECISION TREE
+------------------------------------------------------------------------+
|                                                                        |
|   Order arrives in simulator                                           |
|            |                                                           |
|            v                                                           |
|   +------------------+                                                 |
|   | What order type? |                                                 |
|   +--------+---------+                                                 |
|            |                                                           |
|            +-------------------------+--------------------+            |
|            |                         |                    |            |
|            v                         v                    v            |
|          MARKET                    LIMIT                 STOP          |
|            |                         |                    |            |
|            v                         v                    v            |
|   +------------------+      +------------------+  Wait until trigger   |
|   | Fill at current  |      | Wait for price   |  price, then re-enter |
|   | best price plus  |      | to reach limit   |  as a MARKET order    |
|   | slippage model   |      | price            |                       |
|   +--------+---------+      +--------+---------+                       |
|            |                         |                                 |
|            v                         v                                 |
|   +------------------+      +------------------+                       |
|   | Check available  |      | Check queue      |                       |
|   | liquidity        |      | position         |                       |
|   +--------+---------+      +--------+---------+                       |
|            |                         |                                 |
|            v                         v                                 |
|   +------------------+      +------------------+                       |
|   | Partial or       |      | Filled? Y/N      |                       |
|   | complete fill?   |      |                  |                       |
|   +--------+---------+      +--------+---------+                       |
|            |                         |                                 |
|            v                         v                                 |
|  +--------------------------------------------------+                  |
|  |                  GENERATE FILL EVENT              |                  |
|  |                                                   |                  |
|  |  Fill {                                           |                  |
|  |    order_id: 12345,                               |                  |
|  |    symbol: "AAPL",                                |                  |
|  |    side: BUY,                                     |                  |
|  |    quantity: 100,                                 |                  |
|  |    price: 150.27,                                 |                  |
|  |    timestamp: "2024-01-15T09:30:01.123456Z",      |                  |
|  |    commission: 1.00,                              |                  |
|  |  }                                                |                  |
|  +--------------------------------------------------+                  |
|                                                                        |
+------------------------------------------------------------------------+
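
The order-type branching can be sketched as a single Rust function that tests one order against one top-of-book quote. Everything here is hypothetical scaffolding (`Side`, `OrderType`, `Order`, `Fill`, `try_fill`). The sketch assumes fixed per-share slippage, fills limit orders at their limit price once the quote reaches it, and deliberately ignores queue position, partial fills, latency, and commission.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Side {
    Buy,
    Sell,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OrderType {
    Market,
    Limit { limit_price: f64 },
    Stop { trigger_price: f64 },
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Order {
    pub id: u64,
    pub side: Side,
    pub quantity: u64,
    pub kind: OrderType,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Fill {
    pub order_id: u64,
    pub quantity: u64,
    pub price: f64,
}

/// Tries to fill `order` against one quote. Returns None if the order
/// should keep waiting for a later quote.
pub fn try_fill(order: &Order, bid: f64, ask: f64, slippage: f64) -> Option<Fill> {
    // Buys lift the ask, sells hit the bid; slippage always moves the
    // price against the trader.
    let market_price = match order.side {
        Side::Buy => ask + slippage,
        Side::Sell => bid - slippage,
    };
    let price = match order.kind {
        OrderType::Market => market_price,
        OrderType::Limit { limit_price } => {
            let reached = match order.side {
                Side::Buy => ask <= limit_price,
                Side::Sell => bid >= limit_price,
            };
            if !reached {
                return None;
            }
            limit_price
        }
        OrderType::Stop { trigger_price } => {
            // Once triggered, a stop behaves like a market order.
            let triggered = match order.side {
                Side::Buy => ask >= trigger_price,
                Side::Sell => bid <= trigger_price,
            };
            if !triggered {
                return None;
            }
            market_price
        }
    };
    Some(Fill { order_id: order.id, quantity: order.quantity, price })
}
```

Against a 150.24 / 150.26 quote with $0.01 slippage, a market buy fills near 150.27, a buy limit at 150.20 keeps waiting, and a sell stop at 150.00 stays dormant until the bid drops to or below 150.00.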

Fill Simulation Modes:

Mode          Description                             Use Case
------------  --------------------------------------  --------------------------------
Optimistic    Fill at best price, no slippage         Quick validation, upper bound
Pessimistic   Fill at worst price in bar              Lower bound estimate
Volume-based  Fill proportional to volume             Realistic for liquid assets
Order book    Simulate actual order book consumption  Most realistic, requires L2 data
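
The first three modes can be illustrated on OHLCV bars with a small Rust sketch covering market buys only. The type and function names are hypothetical, and several choices are assumptions rather than definitions from the table: "best price" is read as the bar's open, the volume-based mode caps the fill at a fraction of bar volume (`max_participation`) and prices it at the high/low midpoint. The order book mode is omitted because it needs L2 data.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Bar {
    pub open: f64,
    pub high: f64,
    pub low: f64,
    pub close: f64,
    pub volume: u64,
}

#[derive(Debug, Clone, Copy)]
pub enum FillMode {
    /// Fill everything at the open with no slippage (assumed reading of "best price").
    Optimistic,
    /// Fill everything at the worst price in the bar; for a buy, the high.
    Pessimistic,
    /// Fill at most `max_participation` of the bar's volume.
    VolumeBased { max_participation: f64 },
}

/// Returns (filled_quantity, fill_price) for a market BUY executed within `bar`.
/// A sell would mirror this, using the low as the pessimistic price.
pub fn fill_buy_on_bar(bar: &Bar, quantity: u64, mode: FillMode) -> (u64, f64) {
    match mode {
        FillMode::Optimistic => (quantity, bar.open),
        FillMode::Pessimistic => (quantity, bar.high),
        FillMode::VolumeBased { max_participation } => {
            // Cap the fill at a share of traded volume; the rest stays open.
            let cap = (bar.volume as f64 * max_participation).floor() as u64;
            let price = (bar.high + bar.low) / 2.0;
            (quantity.min(cap), price)
        }
    }
}
```

On a bar with open 100, high 102, low 99, and volume 10,000, a 5000-share buy at 25% participation fills 2500 shares at 100.5 and leaves the remainder for later bars. Running the same strategy under the optimistic and pessimistic modes brackets the plausible P&L range.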

Performance Metrics

The metrics that quantify strategy quality:

                    KEY PERFORMANCE METRICS
+------------------------------------------------------------------------+
|                                                                        |
|  1. SHARPE RATIO                                                       |
|     ============                                                       |
|                                                                        |
|     Sharpe = (R_strategy - R_risk_free) / StdDev(R_strategy)           |
|                                                                        |
|     Where:                                                             |
|       R_strategy  = Mean return of strategy                            |
|       R_risk_free = Risk-free rate (e.g., T-bill rate, often ~0)       |
|       StdDev      = Standard deviation of returns                      |
|                                                                        |
|     Interpretation:                                                    |
|       < 1.0    Poor (not worth the risk)                               |
|       1.0-2.0  Good (tradable with caution)                            |
|       2.0-3.0  Excellent (institutional quality)                       |
|       > 3.0    Outstanding (rare, check for bugs!)                     |
|                                                                        |
|     Example:                                                           |
|       Daily returns: 0.1%, 0.2%, -0.1%, 0.3%, 0.05%                     |
|       Mean: 0.11%  StdDev: 0.15%  Risk-free: 0.01%                      |
|       Annualized: (0.11 - 0.01) / 0.15 * sqrt(252) = 10.6              |
|                                                                        |
+------------------------------------------------------------------------+
|                                                                        |
|  2. MAXIMUM DRAWDOWN                                                   |
|     ================                                                   |
|                                                                        |
|     Peak         +--------+                                            |
|                 /          \                                           |
|                /            \         Recovery                         |
|     +---------+              \        Peak                             |
|     |                         \      +-------+                         |
|     |                          \    /                                  |
|     |                           \  /                                   |
|     |           Drawdown         \/  Trough                            |
|     |            = 12%                                                 |
|     +-----------------------------------------------------> Time       |
|                                                                        |
|     Max Drawdown = (Peak - Trough) / Peak                              |
|                                                                        |
|     Interpretation:                                                    |
|       < 10%   Conservative                                             |
|       10-20%  Moderate                                                 |
|       20-30%  Aggressive                                               |
|       > 30%   Very risky (most investors will panic-sell)              |
|                                                                        |
+------------------------------------------------------------------------+
|                                                                        |
|  3. SORTINO RATIO                                                      |
|     =============                                                      |
|                                                                        |
|     Like Sharpe, but only penalizes downside volatility                |
|                                                                        |
|     Sortino = (R_strategy - R_target) / Downside_Deviation             |
|                                                                        |
|     Where:                                                             |
|       Downside_Deviation = sqrt(mean(min(R - R_target, 0)^2))          |
|                                                                        |
|     Why better than Sharpe?                                            |
|       Sharpe penalizes upside volatility (which is good!)              |
|       Sortino only penalizes losses                                    |
|                                                                        |
+------------------------------------------------------------------------+
|                                                                        |
|  4. OTHER METRICS                                                      |
|     =============                                                      |
|                                                                        |
|     Win Rate:       # winning trades / # total trades                  |
|     Profit Factor:  Gross profit / Gross loss                          |
|     Calmar Ratio:   Annual return / Max drawdown                       |
|     CAGR:           Compound Annual Growth Rate                        |
|     Average Trade:  Net profit / # trades                              |
|     Holding Period: Average time in position                           |
|                                                                        |
+------------------------------------------------------------------------+
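
The three headline metrics can be sketched in a few Rust functions. The function names are hypothetical and the conventions are assumptions of this example: returns are per-period fractions, standard deviation uses the sample (n - 1) form, downside deviation is the root mean square of shortfalls below the target taken over all periods, and drawdown is reported as a positive fraction of the running peak.

```rust
/// Arithmetic mean; None for an empty slice.
fn mean(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        None
    } else {
        Some(xs.iter().sum::<f64>() / xs.len() as f64)
    }
}

/// Sample standard deviation (n - 1 denominator); needs at least two points.
fn sample_std(xs: &[f64]) -> Option<f64> {
    if xs.len() < 2 {
        return None;
    }
    let m = mean(xs)?;
    let var = xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() - 1) as f64;
    Some(var.sqrt())
}

/// Annualized Sharpe ratio from per-period returns and a per-period risk-free rate.
pub fn sharpe(returns: &[f64], risk_free: f64, periods_per_year: f64) -> Option<f64> {
    let sd = sample_std(returns)?;
    if sd == 0.0 {
        return None;
    }
    Some((mean(returns)? - risk_free) / sd * periods_per_year.sqrt())
}

/// Annualized Sortino ratio; only returns below `target` add to the risk term.
pub fn sortino(returns: &[f64], target: f64, periods_per_year: f64) -> Option<f64> {
    let m = mean(returns)?;
    let downside_var = returns
        .iter()
        .map(|r| (r - target).min(0.0).powi(2))
        .sum::<f64>()
        / returns.len() as f64;
    if downside_var == 0.0 {
        return None; // no downside observed, ratio undefined
    }
    Some((m - target) / downside_var.sqrt() * periods_per_year.sqrt())
}

/// Maximum drawdown of an equity curve as a positive fraction of the running peak.
pub fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::MIN;
    let mut worst = 0.0_f64;
    for &value in equity {
        peak = peak.max(value);
        if peak > 0.0 {
            worst = worst.max((peak - value) / peak);
        }
    }
    worst
}
```

Fed the five daily returns from the Sharpe example with a 0.01% daily risk-free rate and 252 periods per year, `sharpe` returns roughly 10.5; the 10.6 in the box comes from rounding the standard deviation to 0.15%. Five data points are far too few for a meaningful estimate, which is why a value this high should be read as a warning sign. An equity curve of 100, 120, 105.6, 110, 125 gives a maximum drawdown of 0.12, the 12% shown in the chart.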

2.2 Why This Matters

You Cannot Trade Without Backtesting.

Every serious trader and quant fund backtests strategies before deployment. The alternative is gambling with real money:

                    THE COST OF NOT BACKTESTING
+------------------------------------------------------------------------+
|                                                                        |
|   Scenario: You have a "great idea" for a trading strategy             |
|                                                                        |
|   Without Backtesting:                                                 |
|   +----------------------------------------------------------+        |
|   |                                                          |        |
|   |  Day 1:  Deploy with $100,000                            |        |
|   |  Day 5:  Down 8%, but "it's just noise"                  |        |
|   |  Day 10: Down 15%, start to worry                        |        |
|   |  Day 20: Down 25%, panic and close positions             |        |
|   |  Result: -$25,000 and emotional trauma                   |        |
|   |                                                          |        |
|   |  You never knew the strategy had a 40% drawdown in 2018  |        |
|   |                                                          |        |
|   +----------------------------------------------------------+        |
|                                                                        |
|   With Backtesting:                                                    |
|   +----------------------------------------------------------+        |
|   |                                                          |        |
|   |  Before deploying:                                       |        |
|   |  - Run backtest on 10 years of data                      |        |
|   |  - Discover Sharpe ratio is only 0.3                     |        |
|   |  - See 40% drawdown in 2018                              |        |
|   |  - Expected win rate: 32%                                |        |
|   |                                                          |        |
|   |  Decision: Refine strategy or don't trade it             |        |
|   |  Cost: 2 hours of compute time                           |        |
|   |                                                          |        |
|   +----------------------------------------------------------+        |
|                                                                        |
|   The backtest just saved you $25,000.                                 |
|                                                                        |
+------------------------------------------------------------------------+

Real-World Usage:

User            How They Use Backtesting
--------------  ---------------------------------------------------
Quant Funds     Run millions of backtests to find alpha
Retail Traders  Validate strategy ideas before live trading
Academics       Research market behavior and inefficiencies
Regulators      Reconstruct trading behavior for investigations
Risk Managers   Stress-test portfolios against historical scenarios

2.3 Historical Context

The evolution of quantitative trading and backtesting:

                    HISTORY OF QUANTITATIVE TRADING
+------------------------------------------------------------------------+
|                                                                        |
|  1950s-1960s: ACADEMIC FOUNDATIONS                                     |
|  ================================                                      |
|  - Harry Markowitz: Modern Portfolio Theory (1952)                     |
|  - Efficient Market Hypothesis emerges                                 |
|  - Computing is expensive and primitive                                |
|                                                                        |
|  1970s-1980s: EARLY QUANTS                                             |
|  =========================                                             |
|  - Black-Scholes options pricing (1973)                                |
|  - Ed Thorp applies math to markets (Princeton Newport Partners)       |
|  - Renaissance Technologies founded (1982)                             |
|  - First commercial backtesting systems                                |
|                                                                        |
|  1990s: EXPLOSION OF DATA                                              |
|  ========================                                              |
|  - Tick-level data becomes available                                   |
|  - Long-Term Capital Management rises and falls                        |
|  - Excel-based backtesting is common                                   |
|  - TradeStation and MetaStock for retail                               |
|                                                                        |
|  2000s: ALGORITHMIC TRADING GOES MAINSTREAM                            |
|  ==========================================                            |
|  - High-frequency trading emerges                                      |
|  - Python becomes the quant's language                                 |
|  - QuantLib, zipline, and open-source tools                            |
|  - Flash Crash (2010) highlights algo risks                            |
|                                                                        |
|  2010s-2020s: MACHINE LEARNING ERA                                     |
|  =================================                                     |
|  - ML applied to alpha discovery                                       |
|  - Alternative data (satellites, sentiment)                            |
|  - Cloud computing enables massive backtests                           |
|  - Retail platforms (QuantConnect, Alpaca)                             |
|  - Overfitting becomes major concern                                   |
|                                                                        |
|  TODAY: EVERYONE IS A QUANT                                            |
|  ==========================                                            |
|  - Free data, free tools, free education                               |
|  - Competition for alpha is intense                                    |
|  - Sophisticated backtesting is table stakes                           |
|                                                                        |
+------------------------------------------------------------------------+

2.4 Common Misconceptions

Misconception 1: “A profitable backtest means the strategy will work live”

Reality: Backtests carry many biases that inflate performance. A strategy that made 50% per year in a backtest might lose money live due to:

  • Lookahead bias (accidentally using future data)
  • Survivorship bias (only testing on stocks that still exist)
  • Overfitting (curve-fitting to historical data)
  • Execution assumptions (getting fills that wouldn’t happen in reality)

Misconception 2: “High Sharpe ratio means low risk”

Reality: Sharpe ratio measures historical risk-adjusted return. It doesn’t account for:

  • Tail risk (rare but catastrophic events)
  • Regime changes (strategy works until it doesn’t)
  • Model risk (your assumptions might be wrong)

A strategy with Sharpe 3.0 can still lose 50% in a crisis.

Misconception 3: “More data is always better”

Reality: Thirty years of data spans many regime changes:

  • Market structure has changed (decimalization, HFT, ETFs)
  • Regulations have changed (Reg NMS, circuit breakers)
  • Technology has changed (retail access, information speed)

Often 3-5 years of recent data is more relevant than 30 years of stale data.

Misconception 4: “Slippage doesn’t matter for a long-term strategy”

Reality: Suppose you make 100 trades per year of 1,000 shares each and lose $0.05 per share to slippage:

Annual slippage = 100 * 1000 * $0.05 = $5,000
On a $100,000 account, that's 5% annual drag
A 10% annual return becomes 5% after slippage

Slippage always matters.
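
The arithmetic above can be checked with a few lines of Rust. This is a minimal sketch: the function names are illustrative, and it assumes a constant per-share slippage on every execution, which real markets do not guarantee.

```rust
/// Annual slippage cost in dollars, assuming the same per-share slippage
/// on every execution (a simplification: real slippage varies with order
/// size and liquidity).
fn annual_slippage_cost(
    executions_per_year: u32,
    shares_per_execution: u32,
    slippage_per_share: f64,
) -> f64 {
    f64::from(executions_per_year) * f64::from(shares_per_execution) * slippage_per_share
}

/// Return left after slippage. `gross_return` is a fraction of equity
/// (0.10 means 10%); the cost is converted to the same unit.
fn return_after_slippage(gross_return: f64, slippage_cost: f64, account_equity: f64) -> f64 {
    gross_return - slippage_cost / account_equity
}
```

Calling `annual_slippage_cost(100, 1000, 0.05)` gives roughly $5,000, and `return_after_slippage(0.10, 5000.0, 100_000.0)` gives roughly 0.05. If each of the 100 trades is a round trip with slippage on both entry and exit, the execution count doubles and so does the drag.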

Misconception 5: “Simple fill simulation is good enough”

Reality: Simple fill simulation (fill at best price) can make unprofitable strategies look profitable. A strategy that “buys the dip” might look great in a backtest but fail live because:

  • In the backtest, it gets filled at the bottom tick
  • In reality, you’re competing with HFTs for those fills
  • Your order itself moves the price
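
To make the difference concrete, here is a small sketch contrasting a naive fill model with a more conservative one. The `Quote` type and both function names are assumptions made for illustration, and proportional slippage is only one of several possible models.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Side {
    Buy,
    Sell,
}

/// Top-of-book snapshot at the moment the order reaches the market.
struct Quote {
    bid: f64,
    ask: f64,
    last: f64,
}

/// Naive model: every order fills at the last traded price.
fn naive_fill_price(quote: &Quote) -> f64 {
    quote.last
}

/// Conservative model: cross the spread, then pay proportional slippage.
/// `slippage` is a fraction of price (0.001 means 0.1%).
fn conservative_fill_price(quote: &Quote, side: Side, slippage: f64) -> f64 {
    match side {
        Side::Buy => quote.ask * (1.0 + slippage),
        Side::Sell => quote.bid * (1.0 - slippage),
    }
}
```

With a bid of 99.98, an ask of 100.02 and a last price of 100.00, the naive model buys and sells at 100.00, while the conservative model buys above the ask and sells below the bid. A strategy whose edge is smaller than that gap is profitable only under the naive model.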

3. Project Specification

3.1 What You Will Build

A comprehensive backtesting engine that:

  1. Loads and replays historical tick/bar data in strict temporal order
  2. Provides a strategy interface with callbacks for market events
  3. Simulates order execution with configurable slippage and latency
  4. Calculates performance metrics (Sharpe, drawdown, P&L curves)
  5. Supports multiple strategies for side-by-side comparison
  6. Generates detailed trade logs for post-analysis
  7. Handles large datasets efficiently using streaming or memory-mapped files

3.2 Functional Requirements

ID    Requirement                                                       Priority
FR1   Load tick data from CSV/binary files                              Must Have
FR2   Replay events in chronological order                              Must Have
FR3   Strategy interface with on_tick(), on_bar(), on_fill() callbacks  Must Have
FR4   Support market and limit orders                                   Must Have
FR5   Calculate P&L per trade and cumulative                            Must Have
FR6   Calculate Sharpe ratio and max drawdown                           Must Have
FR7   Apply configurable slippage model                                 Must Have
FR8   Generate trade log with all executions                            Must Have
FR9   Support multiple symbols simultaneously                           Should Have
FR10  Support multiple strategies for comparison                        Should Have
FR11  Simulate order latency                                            Should Have
FR12  Handle partial fills based on volume                              Nice to Have
FR13  Memory-efficient streaming for large files                        Nice to Have
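
One possible shape for the strategy interface in FR3 is a Rust trait with default no-op callbacks. Everything here is a sketch: the type names, the fields, and the choice to pass an order buffer into each callback are assumptions, not a prescribed design.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Side {
    Buy,
    Sell,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum OrderType {
    Market,
    Limit(f64),
}

/// What a strategy hands to the engine; the engine assigns the order ID.
#[derive(Debug, Clone, Copy, PartialEq)]
struct OrderRequest {
    side: Side,
    order_type: OrderType,
    quantity: u32,
}

struct Tick {
    timestamp: i64,
    bid_price: f64,
    ask_price: f64,
}

struct Bar {
    timestamp: i64,
    close: f64,
}

struct Fill {
    order_id: u64,
    side: Side,
    quantity: u32,
    price: f64,
}

/// Callbacks default to no-ops so a strategy overrides only what it needs.
trait Strategy {
    fn on_tick(&mut self, _tick: &Tick, _orders: &mut Vec<OrderRequest>) {}
    fn on_bar(&mut self, _bar: &Bar, _orders: &mut Vec<OrderRequest>) {}
    fn on_fill(&mut self, _fill: &Fill) {}
}

/// Toy strategy: buy 100 shares at market on the first bar, then do nothing.
struct BuyOnce {
    submitted: bool,
    filled_qty: u32,
}

impl Strategy for BuyOnce {
    fn on_bar(&mut self, _bar: &Bar, orders: &mut Vec<OrderRequest>) {
        if !self.submitted {
            orders.push(OrderRequest {
                side: Side::Buy,
                order_type: OrderType::Market,
                quantity: 100,
            });
            self.submitted = true;
        }
    }

    fn on_fill(&mut self, fill: &Fill) {
        self.filled_qty += fill.quantity;
    }
}
```

Passing a `&mut Vec<OrderRequest>` keeps strategies free of any reference to the engine, which makes them easy to unit test and, in principle, reusable against a paper or live order gateway.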

3.3 Non-Functional Requirements

Requirement      Target                          Measurement
Data throughput  1M+ ticks/second                Benchmark with large dataset
Memory usage     < 1GB for 1-year tick data      Monitor with valgrind/heaptrack
Startup time     < 5 seconds for 1GB data        Time from start to first event
Accuracy         P&L matches manual calculation  Unit tests on known scenarios
Reproducibility  Same data = same results        Run twice, compare outputs

3.4 Example Usage / Output

$ ./backtester --data data/AAPL_2023.csv \
               --strategy mean_reversion \
               --capital 100000 \
               --slippage 0.001 \
               --output results/

================================================================================
                    TRADING STRATEGY BACKTESTER
================================================================================

Configuration:
  Data file:      data/AAPL_2023.csv
  Symbol:         AAPL
  Strategy:       Mean Reversion (20-day)
  Initial capital: $100,000.00
  Slippage model: 0.1% per trade
  Date range:     2023-01-03 to 2023-12-29

Loading data...
  Loaded 2,340,567 ticks in 1.23 seconds

Running backtest...
  [========================================] 100% | 2,340,567 events

================================================================================
                         PERFORMANCE SUMMARY
================================================================================

  Initial Capital:     $100,000.00
  Final Capital:       $118,456.23
  Net Profit:          $18,456.23 (+18.46%)

  Sharpe Ratio:        1.82 (annualized)
  Sortino Ratio:       2.34
  Max Drawdown:        -8.67% ($9,234.12)
  Calmar Ratio:        2.13

  Total Trades:        156
  Winning Trades:      89 (57.05%)
  Losing Trades:       67 (42.95%)
  Avg Winner:          $412.34
  Avg Loser:           -$245.67
  Profit Factor:       2.23

  Avg Trade Duration:  3.2 days
  Max Consecutive Win: 7
  Max Consecutive Loss: 4

================================================================================
                         MONTHLY RETURNS
================================================================================

  Jan:   +2.34%   Feb:   +1.89%   Mar:   -0.45%   Apr:   +3.12%
  May:   +1.56%   Jun:   +2.78%   Jul:   +1.23%   Aug:   -1.34%
  Sep:   +0.89%   Oct:   +2.45%   Nov:   +1.67%   Dec:   +2.12%

================================================================================
                         P&L CURVE (ASCII)
================================================================================

  $120k |                                              ....***
  $115k |                                    ...****'''
  $110k |                           ....****'
  $105k |              .....'''''***
  $100k |**'''''''.....'
   $95k |    ...
        +-----------------------------------------------------------> Time
        Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec

================================================================================
                         TRADE LOG (Last 10 Trades)
================================================================================

  ID    | Date       | Side | Qty  | Entry    | Exit     | P&L      | Duration
  ------|------------|------|------|----------|----------|----------|----------
  147   | 2023-12-05 | LONG | 100  | $189.23  | $192.45  | +$322.00 | 2d
  148   | 2023-12-08 | SHORT| 100  | $191.67  | $189.34  | +$233.00 | 1d
  149   | 2023-12-11 | LONG | 100  | $188.89  | $190.12  | +$123.00 | 1d
  150   | 2023-12-13 | SHORT| 100  | $191.23  | $193.45  | -$222.00 | 2d
  151   | 2023-12-15 | LONG | 100  | $192.34  | $193.67  | +$133.00 | 1d
  152   | 2023-12-18 | SHORT| 100  | $194.56  | $192.23  | +$233.00 | 1d
  153   | 2023-12-20 | LONG | 100  | $191.45  | $194.78  | +$333.00 | 2d
  154   | 2023-12-22 | SHORT| 100  | $195.23  | $193.12  | +$211.00 | 1d
  155   | 2023-12-26 | LONG | 100  | $192.67  | $194.89  | +$222.00 | 2d
  156   | 2023-12-28 | SHORT| 100  | $195.45  | $193.23  | +$222.00 | 1d

================================================================================
                         FILES GENERATED
================================================================================

  results/performance_summary.json    - Full metrics in JSON format
  results/trades.csv                  - Complete trade log
  results/equity_curve.csv            - Daily equity values
  results/pnl_by_trade.csv           - P&L for each trade

================================================================================
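
Two of the summary figures in this report, max drawdown and profit factor, follow directly from the equity curve and the per-trade P&L. The sketch below shows one way to compute them; the function names are illustrative, and drawdown is returned as a positive fraction rather than the negative percentage printed in the report.

```rust
/// Largest peak-to-trough decline of an equity curve, as a positive
/// fraction of the peak (0.25 means a 25% drawdown).
fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::MIN;
    let mut worst = 0.0_f64;
    for &value in equity {
        peak = peak.max(value);
        if peak > 0.0 {
            worst = worst.max((peak - value) / peak);
        }
    }
    worst
}

/// Gross profit divided by gross loss over closed trades.
/// Returns None when there are no losing trades (the ratio is undefined).
fn profit_factor(trade_pnls: &[f64]) -> Option<f64> {
    let gross_profit: f64 = trade_pnls.iter().filter(|p| **p > 0.0).sum();
    let gross_loss: f64 = trade_pnls.iter().filter(|p| **p < 0.0).map(|p| -p).sum();
    if gross_loss > 0.0 {
        Some(gross_profit / gross_loss)
    } else {
        None
    }
}
```

For the equity curve `[100, 120, 90, 110, 130, 117]` the worst decline is from 120 to 90, a drawdown of 0.25; the later dip from 130 to 117 is only 0.10. For trade P&Ls `[300, -100, 100, -100]` the profit factor is 400 / 200 = 2.0.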

3.5 Real World Outcome

When complete, you will have:

  1. A working backtesting engine capable of processing millions of ticks and generating production-quality performance reports

  2. Understanding of backtest pitfalls that will make you skeptical of any backtest (including your own) - a crucial skill for any quant

  3. Strategy development foundation that you can use to test any trading idea quickly

  4. Portfolio-ready project demonstrating quantitative finance knowledge

  5. Bridge to live trading - the same strategy interface can connect to a paper or live trading system


4. Solution Architecture

4.1 High-Level Design

                    BACKTESTER ARCHITECTURE
+------------------------------------------------------------------------+
|                                                                        |
|  +----------------------------------------------------------------+   |
|  |                        DATA LAYER                               |   |
|  |                                                                 |   |
|  |  +--------------+    +----------------+    +----------------+   |   |
|  |  | CSV Parser   |    | Binary Parser  |    | Memory-Mapped  |   |   |
|  |  | (flexible)   |    | (fast)         |    | (huge files)   |   |   |
|  |  +------+-------+    +-------+--------+    +-------+--------+   |   |
|  |         |                    |                     |            |   |
|  |         +--------------------+---------------------+            |   |
|  |                              |                                  |   |
|  |                              v                                  |   |
|  |                    +-----------------+                          |   |
|  |                    | Event Iterator  |                          |   |
|  |                    | (time-ordered)  |                          |   |
|  |                    +--------+--------+                          |   |
|  |                             |                                   |   |
|  +-----------------------------+-----------------------------------+   |
|                                |                                       |
|                                v                                       |
|  +-----------------------------+-----------------------------------+   |
|  |                        ENGINE LAYER                             |   |
|  |                                                                 |   |
|  |  +------------------+                  +--------------------+   |   |
|  |  | Event Dispatcher |----------------->| Strategy Manager   |   |   |
|  |  |                  |                  |                    |   |   |
|  |  | - Time tracking  |                  | - Multiple strats  |   |   |
|  |  | - Event routing  |                  | - Callback routing |   |   |
|  |  +------------------+                  +----------+---------+   |   |
|  |                                                   |             |   |
|  |                              +--------------------+             |   |
|  |                              |                                  |   |
|  |                              v                                  |   |
|  |                    +------------------+                         |   |
|  |                    |    Strategy      |                         |   |
|  |                    |    Interface     |                         |   |
|  |                    |                  |                         |   |
|  |                    | - on_tick()      |                         |   |
|  |                    | - on_bar()       |                         |   |
|  |                    | - on_fill()      |                         |   |
|  |                    +--------+---------+                         |   |
|  |                             |                                   |   |
|  |                     Orders  |                                   |   |
|  |                             v                                   |   |
|  |                    +------------------+                         |   |
|  |                    | Order Manager    |                         |   |
|  |                    |                  |                         |   |
|  |                    | - Order tracking |                         |   |
|  |                    | - Order ID gen   |                         |   |
|  |                    +--------+---------+                         |   |
|  |                             |                                   |   |
|  |                             v                                   |   |
|  |                    +------------------+                         |   |
|  |                    | Fill Simulator   |                         |   |
|  |                    |                  |                         |   |
|  |                    | - Slippage model |                         |   |
|  |                    | - Partial fills  |                         |   |
|  |                    | - Latency sim    |                         |   |
|  |                    +--------+---------+                         |   |
|  |                             |                                   |   |
|  +-----------------------------+-----------------------------------+   |
|                                |                                       |
|                         Fills  |                                       |
|                                v                                       |
|  +-----------------------------+-----------------------------------+   |
|  |                      ANALYSIS LAYER                             |   |
|  |                                                                 |   |
|  |  +------------------+    +------------------+    +-----------+  |   |
|  |  | Position Tracker |    | P&L Calculator   |    | Metrics   |  |   |
|  |  | (per symbol)     |    | (per trade/day)  |    | Engine    |  |   |
|  |  +------------------+    +------------------+    +-----------+  |   |
|  |                                                                 |   |
|  |  +------------------+    +------------------+    +-----------+  |   |
|  |  | Trade Logger     |    | Equity Curve     |    | Report    |  |   |
|  |  | (all fills)      |    | Generator        |    | Generator |  |   |
|  |  +------------------+    +------------------+    +-----------+  |   |
|  |                                                                 |   |
|  +----------------------------------------------------------------+   |
|                                                                        |
+------------------------------------------------------------------------+

4.2 Key Components

Component           Responsibility                     Performance Requirement
Data Parser         Read tick/bar data from files      1M+ records/second
Event Iterator      Provide time-ordered event stream  O(1) per event for one sorted file; O(log k) when merging k streams
Event Dispatcher    Route events to strategies         O(1) per event
Strategy Interface  Define strategy callbacks          N/A (user code)
Order Manager       Track pending/filled orders        O(1) lookup by ID
Fill Simulator      Determine execution prices         O(1) per order
Position Tracker    Track holdings per symbol          O(1) per update
P&L Calculator      Calculate profit/loss              O(1) per trade
Metrics Engine      Compute Sharpe, drawdown, etc.     O(n) for n data points
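
When several symbols are replayed together, the Event Iterator has to merge per-symbol streams into one chronological stream. A common technique is a k-way merge over a min-heap, sketched below under the assumption that each input stream is already sorted by timestamp. The `Event` type and `merge_chronological` name are illustrative, and a real engine would likely expose this as a lazy iterator instead of collecting into a `Vec`.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

#[derive(Debug, Clone, PartialEq)]
struct Event {
    timestamp: i64,
    symbol: &'static str,
}

/// Merge several time-sorted streams into one chronological stream.
/// Ties on timestamp are broken by stream index, so output is deterministic.
fn merge_chronological(streams: Vec<Vec<Event>>) -> Vec<Event> {
    let mut iters: Vec<std::vec::IntoIter<Event>> =
        streams.into_iter().map(|s| s.into_iter()).collect();

    // `heads[i]` holds the next unconsumed event of stream i.
    let mut heads: Vec<Option<Event>> = iters.iter_mut().map(|it| it.next()).collect();

    // Min-heap of (timestamp, stream index); Reverse turns the max-heap around.
    let mut heap = BinaryHeap::new();
    for (idx, head) in heads.iter().enumerate() {
        if let Some(ev) = head {
            heap.push(Reverse((ev.timestamp, idx)));
        }
    }

    let mut merged = Vec::new();
    while let Some(Reverse((_, idx))) = heap.pop() {
        if let Some(ev) = heads[idx].take() {
            merged.push(ev);
        }
        // Refill from the same stream and re-register it in the heap.
        heads[idx] = iters[idx].next();
        if let Some(next) = &heads[idx] {
            heap.push(Reverse((next.timestamp, idx)));
        }
    }
    merged
}
```

Each pop and push costs O(log k) for k streams. Breaking ties by stream index matters for the reproducibility requirement: the same inputs always produce the same event order.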

4.3 Data Structures

                    CORE DATA STRUCTURES
+------------------------------------------------------------------------+
|                                                                        |
|  TICK DATA (Market Event)                                              |
|  +------------------------------------------------------------------+  |
|  | timestamp: i64      // Nanoseconds since epoch                   |  |
|  | symbol: [u8; 8]     // Symbol (null-padded)                      |  |
|  | bid_price: f64      // Best bid price                            |  |
|  | bid_size: u32       // Best bid quantity                         |  |
|  | ask_price: f64      // Best ask price                            |  |
|  | ask_size: u32       // Best ask quantity                         |  |
|  | last_price: f64     // Last trade price (optional)               |  |
|  | last_size: u32      // Last trade quantity (optional)            |  |
|  +------------------------------------------------------------------+  |
|  | Size: 52 bytes of fields, padded to 56 or 64 by alignment        |  |
|  | (64 in declared order; either way it fits one cache line)        |  |
|  +------------------------------------------------------------------+  |
|                                                                        |
|  BAR DATA (OHLCV)                                                      |
|  +------------------------------------------------------------------+  |
|  | timestamp: i64      // Bar close time                            |  |
|  | symbol: [u8; 8]     // Symbol                                    |  |
|  | open: f64           // Opening price                             |  |
|  | high: f64           // Highest price in bar                      |  |
|  | low: f64            // Lowest price in bar                       |  |
|  | close: f64          // Closing price                             |  |
|  | volume: u64         // Total volume in bar                       |  |
|  +------------------------------------------------------------------+  |
|  | Size: 56 bytes (fits in one 64-byte cache line)                  |  |
|  +------------------------------------------------------------------+  |
|                                                                        |
|  ORDER                                                                 |
|  +------------------------------------------------------------------+  |
|  | order_id: u64       // Unique identifier                         |  |
|  | symbol: [u8; 8]     // Symbol                                    |  |
|  | side: Side          // BUY or SELL (enum)                        |  |
|  | order_type: Type    // MARKET, LIMIT, STOP                       |  |
|  | quantity: u32       // Order quantity                            |  |
|  | limit_price: f64    // Limit price (if applicable)               |  |
|  | created_at: i64     // Order creation timestamp                  |  |
|  | status: Status      // PENDING, PARTIAL, FILLED, CANCELLED       |  |
|  | filled_qty: u32     // Quantity already filled                   |  |
|  | filled_price: f64   // Average fill price so far                 |  |
|  +------------------------------------------------------------------+  |
|                                                                        |
|  FILL                                                                  |
|  +------------------------------------------------------------------+  |
|  | fill_id: u64        // Unique fill identifier                    |  |
|  | order_id: u64       // Parent order ID                           |  |
|  | symbol: [u8; 8]     // Symbol                                    |  |
|  | side: Side          // BUY or SELL                               |  |
|  | quantity: u32       // Fill quantity                             |  |
|  | price: f64          // Execution price                           |  |
|  | timestamp: i64      // Fill timestamp                            |  |
|  | commission: f64     // Commission charged                        |  |
|  +------------------------------------------------------------------+  |
|                                                                        |
|  POSITION                                                              |
|  +------------------------------------------------------------------+  |
|  | symbol: [u8; 8]     // Symbol                                    |  |
|  | quantity: i64       // Signed qty (positive=long, negative=short)|  |
|  | avg_price: f64      // Average entry price                       |  |
|  | realized_pnl: f64   // Cumulative realized P&L                   |  |
|  | unrealized_pnl: f64 // Current unrealized P&L                    |  |
|  +------------------------------------------------------------------+  |
|                                                                        |
|  PERFORMANCE METRICS                                                   |
|  +------------------------------------------------------------------+  |
|  | total_pnl: f64          // Net profit/loss                       |  |
|  | sharpe_ratio: f64       // Annualized Sharpe                     |  |
|  | sortino_ratio: f64      // Annualized Sortino                    |  |
|  | max_drawdown: f64       // Maximum drawdown (as decimal)         |  |
|  | total_trades: u32       // Number of completed trades            |  |
|  | winning_trades: u32     // Number of profitable trades           |  |
|  | gross_profit: f64       // Sum of winning trade profits          |  |
|  | gross_loss: f64         // Sum of losing trade losses            |  |
|  | avg_trade: f64          // Average profit per trade              |  |
|  | profit_factor: f64      // Gross profit / Gross loss             |  |
|  | returns: Vec<f64>       // Time series of returns                |  |
|  | equity_curve: Vec<f64>  // Time series of portfolio value        |  |
|  +------------------------------------------------------------------+  |
|                                                                        |
+------------------------------------------------------------------------+
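
The in-memory size of these records depends on field order and padding, so it is worth measuring instead of assuming. The sketch below declares the tick and bar layouts with `#[repr(C)]` (fields kept in declared order) and checks their sizes with `std::mem::size_of`. It assumes a typical 64-bit target where `f64` and `i64` are 8-byte aligned; `TickReordered` and `pack_symbol` are hypothetical names introduced here.

```rust
use std::mem::size_of;

/// Tick with fields in the order listed above. The u32 fields sitting
/// between f64 fields force padding, so this layout takes 64 bytes.
#[repr(C)]
struct Tick {
    timestamp: i64,
    symbol: [u8; 8],
    bid_price: f64,
    bid_size: u32,
    ask_price: f64,
    ask_size: u32,
    last_price: f64,
    last_size: u32,
}

/// Same fields with the 8-byte members grouped first: 52 bytes of data
/// plus 4 bytes of tail padding, 56 bytes in total.
#[repr(C)]
struct TickReordered {
    timestamp: i64,
    symbol: [u8; 8],
    bid_price: f64,
    ask_price: f64,
    last_price: f64,
    bid_size: u32,
    ask_size: u32,
    last_size: u32,
}

/// Seven 8-byte fields: 56 bytes with no padding.
#[repr(C)]
struct Bar {
    timestamp: i64,
    symbol: [u8; 8],
    open: f64,
    high: f64,
    low: f64,
    close: f64,
    volume: u64,
}

/// Copy a ticker into the fixed-width, null-padded symbol field.
/// Symbols longer than 8 bytes are truncated.
fn pack_symbol(symbol: &str) -> [u8; 8] {
    let mut out = [0u8; 8];
    let bytes = symbol.as_bytes();
    let n = bytes.len().min(out.len());
    out[..n].copy_from_slice(&bytes[..n]);
    out
}
```

Without `#[repr(C)]`, the Rust compiler is free to reorder fields and will usually pick the compact layout on its own, but that is not guaranteed. A binary file format should use an explicit representation so that the on-disk layout does not change between compiler versions.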

4.4 Algorithm Overview

Main Event Loop:

                    BACKTEST EVENT LOOP
+------------------------------------------------------------------------+
|                                                                        |
|  fn run_backtest(data_source, strategies, config) -> Results:          |
|                                                                        |
|      initialize()                                                      |
|          - Set initial capital                                         |
|          - Initialize position trackers                                |
|          - Initialize metric collectors                                |
|                                                                        |
|      for event in data_source.iter_chronological():                    |
|                                                                        |
|          // Update simulated clock                                     |
|          current_time = event.timestamp                                |
|                                                                        |
|          // Process any pending orders that should fill                |
|          for order in pending_orders:                                  |
|              if should_fill(order, event):                             |
|                  fill = execute_order(order, event)                    |
|                  update_position(fill)                                 |
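|                  // Notify the strategy that placed the order          |
|                  notify_strategy(owner_of(order), fill)                |
|                  if order.is_filled():                                 |
|                      remove_from_pending(order)                        |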
|                                                                        |
|          // Notify strategies of market event                          |
|          for strategy in strategies:                                   |
|              match event.type:                                         |
|                  Tick => strategy.on_tick(event)                       |
|                  Bar  => strategy.on_bar(event)                        |
|                                                                        |
|              // Process any new orders from strategy                   |
|              for order in strategy.pending_orders():                   |
|                  validate_order(order)                                 |
|                  add_to_pending(order)                                 |
|                                                                        |
|          // Update unrealized P&L with current prices                  |
|          update_unrealized_pnl(event)                                  |
|                                                                        |
|          // Record equity snapshot                                     |
|          record_equity(current_time, total_equity())                   |
|                                                                        |
|      // Backtest complete - calculate final metrics                    |
|      return calculate_metrics()                                        |
|                                                                        |
+------------------------------------------------------------------------+
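
The loop above can be reduced to a small runnable sketch. It makes several simplifying assumptions that are not part of the specification: one symbol, one strategy, market orders only, no slippage or commission, and a single `on_event` callback in place of `on_tick`/`on_bar`. All names are illustrative. The property it preserves is ordering: an order submitted on one event can fill no earlier than the next event.

```rust
#[derive(Clone, Copy)]
struct Event {
    timestamp: i64,
    price: f64,
}

/// Market order with a signed quantity: positive buys, negative sells.
#[derive(Clone, Copy)]
struct Order {
    quantity: i64,
}

trait Strategy {
    fn on_event(&mut self, event: &Event) -> Vec<Order>;
    fn on_fill(&mut self, _quantity: i64, _price: f64) {}
}

/// Replays `events` in order and returns the equity curve as
/// (timestamp, equity) pairs.
fn run_backtest(
    events: &[Event],
    strategy: &mut dyn Strategy,
    initial_capital: f64,
) -> Vec<(i64, f64)> {
    let mut cash = initial_capital;
    let mut position: i64 = 0;
    let mut pending: Vec<Order> = Vec::new();
    let mut equity_curve = Vec::with_capacity(events.len());

    for event in events {
        // 1. Fill orders submitted on earlier events at the current price.
        for order in pending.drain(..) {
            cash -= order.quantity as f64 * event.price;
            position += order.quantity;
            strategy.on_fill(order.quantity, event.price);
        }

        // 2. Let the strategy react; its new orders wait for the next event,
        //    so it can never trade on the price it has just seen.
        pending.extend(strategy.on_event(event));

        // 3. Mark the portfolio to market and record equity.
        equity_curve.push((event.timestamp, cash + position as f64 * event.price));
    }
    equity_curve
}

/// Toy strategy: buy 10 shares on the first event, sell them on the third.
struct BuyThenSell {
    events_seen: usize,
}

impl Strategy for BuyThenSell {
    fn on_event(&mut self, _event: &Event) -> Vec<Order> {
        self.events_seen += 1;
        match self.events_seen {
            1 => vec![Order { quantity: 10 }],
            3 => vec![Order { quantity: -10 }],
            _ => Vec::new(),
        }
    }
}
```

With prices 100, 101, 105, 103 and $10,000 of capital, the buy submitted at 100 fills at 101 and the sell submitted at 105 fills at 103, leaving $10,020. Filling at the submission price instead would report $10,050, which is the kind of optimistic error that lookahead introduces.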

P&L Calculation:

                    P&L CALCULATION FLOW
+------------------------------------------------------------------------+
|                                                                        |
|  On each FILL:                                                         |
|                                                                        |
|  +-------------------------------------------------------------+      |
|  |                                                             |      |
|  |  if fill.side == BUY:                                       |      |
|  |                                                             |      |
|  |      if position.quantity >= 0:                             |      |
|  |          // Opening or adding to long position              |      |
|  |          new_avg = (position.quantity * position.avg_price  |      |
|  |                   + fill.quantity * fill.price)             |      |
|  |                   / (position.quantity + fill.quantity)     |      |
|  |          position.avg_price = new_avg                       |      |
|  |          position.quantity += fill.quantity                 |      |
|  |                                                             |      |
|  |      else:                                                  |      |
|  |          // Covering short position                         |      |
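|  |          covered = min(fill.quantity,                       |      |
|  |                        abs(position.quantity))              |      |
|  |          realized = covered *                               |      |
|  |                     (position.avg_price - fill.price)       |      |
|  |          position.realized_pnl += realized                  |      |
|  |          position.quantity += fill.quantity                 |      |
|  |          // Flipped to long: remainder opens at fill price  |      |
|  |          if position.quantity > 0:                          |      |
|  |              position.avg_price = fill.price                |      |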
|  |                                                             |      |
|  |  elif fill.side == SELL:                                    |      |
|  |      (mirror logic for selling)                             |      |
|  |                                                             |      |
|  +-------------------------------------------------------------+      |
|                                                                        |
|  Unrealized P&L (updated on each tick):                                |
|                                                                        |
|  +-------------------------------------------------------------+      |
|  |                                                             |      |
|  |  for symbol in positions:                                   |      |
|  |      current_price = latest_price[symbol]                   |      |
|  |                                                             |      |
|  |      if position.quantity > 0:  // Long                     |      |
|  |          unrealized = position.quantity *                   |      |
|  |                       (current_price - position.avg_price)  |      |
|  |                                                             |      |
|  |      elif position.quantity < 0:  // Short                  |      |
|  |          unrealized = abs(position.quantity) *              |      |
|  |                       (position.avg_price - current_price)  |      |
|  |      else:  // Flat                                         |      |
|  |          unrealized = 0                                     |      |
|  |                                                             |      |
|  |      position.unrealized_pnl = unrealized                   |      |
|  |                                                             |      |
|  +-------------------------------------------------------------+      |
|                                                                        |
+------------------------------------------------------------------------+
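
Using a signed quantity lets the buy and sell branches collapse into one routine. The sketch below is one way to write it, not the only one: it assumes average-cost accounting, ignores commissions, and resets the average price when a fill carries the position through zero. The `Position` fields mirror the data structure in section 4.3; the method names are illustrative.

```rust
#[derive(Debug, Default)]
struct Position {
    quantity: i64, // positive = long, negative = short
    avg_price: f64,
    realized_pnl: f64,
}

impl Position {
    /// Apply a fill. `signed_qty` is positive for a buy, negative for a sell.
    fn apply_fill(&mut self, signed_qty: i64, price: f64) {
        if signed_qty == 0 {
            return;
        }
        let same_direction = self.quantity == 0 || (self.quantity > 0) == (signed_qty > 0);
        if same_direction {
            // Opening or adding: quantity-weighted average entry price.
            let old = self.quantity.abs() as f64;
            let add = signed_qty.abs() as f64;
            self.avg_price = (old * self.avg_price + add * price) / (old + add);
            self.quantity += signed_qty;
            return;
        }

        // Reducing, closing, or flipping the position.
        let closed = signed_qty.abs().min(self.quantity.abs());
        // Longs gain when price > avg_price; shorts gain when price < avg_price.
        let direction = if self.quantity > 0 { 1.0 } else { -1.0 };
        self.realized_pnl += closed as f64 * (price - self.avg_price) * direction;
        self.quantity += signed_qty;

        if self.quantity == 0 {
            self.avg_price = 0.0;
        } else if (self.quantity > 0) == (signed_qty > 0) {
            // Crossed through zero: the remainder is a new position
            // opened at the fill price.
            self.avg_price = price;
        }
    }

    /// Mark-to-market P&L; the sign of `quantity` handles long and short.
    fn unrealized_pnl(&self, current_price: f64) -> f64 {
        self.quantity as f64 * (current_price - self.avg_price)
    }
}
```

For example, buying 100 at $10 and 100 at $12 gives an average of $11. Selling 250 at $10 then closes 200 shares for a $200 loss and leaves a 50-share short with an average price of $10, not $11.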

Sharpe Ratio Calculation:

                    SHARPE RATIO CALCULATION
+------------------------------------------------------------------------+
|                                                                        |
|  Input: daily_returns = [r1, r2, r3, ..., rn]                          |
|         risk_free_rate = 0.0 (or current T-bill rate)                  |
|                                                                        |
|  Step 1: Calculate excess returns                                      |
|  +-------------------------------------------------------------+      |
|  |  daily_rf = risk_free_rate / 252  // Annualized to daily    |      |
|  |  excess_returns = [r - daily_rf for r in daily_returns]     |      |
|  +-------------------------------------------------------------+      |
|                                                                        |
|  Step 2: Calculate mean and standard deviation                         |
|  +-------------------------------------------------------------+      |
|  |  mean_excess = sum(excess_returns) / len(excess_returns)    |      |
|  |                                                             |      |
|  |  variance = sum((r - mean_excess)^2 for r in excess_returns)|      |
|  |             / (len(excess_returns) - 1)                     |      |
|  |                                                             |      |
|  |  std_dev = sqrt(variance)                                   |      |
|  +-------------------------------------------------------------+      |
|                                                                        |
|  Step 3: Calculate and annualize Sharpe                                |
|  +-------------------------------------------------------------+      |
|  |  daily_sharpe = mean_excess / std_dev                       |      |
|  |  annual_sharpe = daily_sharpe * sqrt(252)                   |      |
|  +-------------------------------------------------------------+      |
|                                                                        |
|  Return: annual_sharpe                                                 |
|                                                                        |
+------------------------------------------------------------------------+

5. Implementation Guide

5.1 Development Environment Setup

Rust Setup:

# Create project
cargo new backtester --lib
cd backtester

# Add dependencies to Cargo.toml
# [dependencies]
# chrono = "0.4"           # Date/time handling
# csv = "1.3"              # CSV parsing
# serde = { version = "1.0", features = ["derive"] }  # Serialization
# serde_json = "1.0"       # JSON output
# thiserror = "1.0"        # Error handling
#
# [dev-dependencies]
# criterion = "0.5"        # Benchmarking
# approx = "0.5"           # Floating-point comparison in tests

# Build and test
cargo build --release
cargo test

C++ Setup:

# Create project structure
mkdir backtester && cd backtester
mkdir src include tests data examples

# CMakeLists.txt should include:
# - C++17 or later
# - Find CSV parser library or include header-only
# - JSON library (nlohmann/json)

# Example CMakeLists.txt additions:
# find_package(nlohmann_json 3.2.0 REQUIRED)
# target_link_libraries(backtester PRIVATE nlohmann_json::nlohmann_json)

Python Setup (for comparison/prototyping):

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install pandas numpy matplotlib

# Python is great for prototyping strategies
# but the backtester itself should be in Rust/C++ for performance

5.2 Project Structure

Rust Structure:

backtester/
+-- Cargo.toml
+-- src/
|   +-- lib.rs               # Public API
|   +-- data/
|   |   +-- mod.rs           # Data module
|   |   +-- tick.rs          # Tick data structures
|   |   +-- bar.rs           # Bar data structures
|   |   +-- loader.rs        # CSV/binary file loading
|   +-- engine/
|   |   +-- mod.rs           # Engine module
|   |   +-- backtester.rs    # Main backtest loop
|   |   +-- order_manager.rs # Order tracking
|   |   +-- fill_simulator.rs # Fill simulation
|   +-- strategy/
|   |   +-- mod.rs           # Strategy trait
|   |   +-- example_strategies/ # Sample strategies
|   +-- analysis/
|   |   +-- mod.rs           # Analysis module
|   |   +-- position.rs      # Position tracking
|   |   +-- pnl.rs           # P&L calculation
|   |   +-- metrics.rs       # Performance metrics
|   +-- output/
|       +-- mod.rs           # Output module
|       +-- report.rs        # Report generation
|       +-- csv_writer.rs    # Trade log output
+-- tests/
|   +-- integration_tests.rs
|   +-- metric_tests.rs
+-- examples/
|   +-- simple_backtest.rs
|   +-- multi_strategy.rs
+-- data/
    +-- sample_ticks.csv

C++ Structure:

backtester/
+-- CMakeLists.txt
+-- include/
|   +-- data/
|   |   +-- tick.hpp
|   |   +-- bar.hpp
|   |   +-- loader.hpp
|   +-- engine/
|   |   +-- backtester.hpp
|   |   +-- order_manager.hpp
|   |   +-- fill_simulator.hpp
|   +-- strategy/
|   |   +-- strategy.hpp     # Abstract base class
|   +-- analysis/
|   |   +-- position.hpp
|   |   +-- pnl.hpp
|   |   +-- metrics.hpp
+-- src/
|   +-- (implementations)
+-- tests/
+-- examples/
+-- data/

5.3 The Core Question You’re Answering

“How do you simulate trading in a way that is both fast enough to test many strategies and realistic enough to be predictive of live performance?”

This question has two conflicting requirements:

  1. Speed: You want to test thousands of parameter combinations quickly
  2. Realism: You want results that match what would happen in live trading

The tension between these goals is the core design challenge:

| Approach | Speed | Realism | Use Case |
|----------|-------|---------|----------|
| Fill at close | Very fast | Poor | Quick screening |
| Fill at next open | Fast | Moderate | Daily strategies |
| Volume-weighted fills | Moderate | Good | Intraday strategies |
| Full order book sim | Slow | Excellent | HFT research |

Your backtester should support multiple fill modes so users can choose the appropriate tradeoff.
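
As a rough illustration of selectable fill modes, the sketch below models only the first two rows of the table on bar data. The names `Bar`, `FillMode`, and `fill_price` are hypothetical and exist only for this example; volume-weighted and order-book fills would need more market data than is shown here.

```rust
/// Hypothetical bar type for this sketch; only the fields needed here.
#[derive(Debug, Clone, Copy)]
pub struct Bar {
    pub open: f64,
    pub close: f64,
}

/// Two of the fill modes from the table above (assumed names).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FillMode {
    /// Fill at the close of the bar that produced the signal (fast, optimistic).
    AtClose,
    /// Fill at the open of the following bar (avoids same-bar lookahead).
    AtNextOpen,
}

/// Returns the fill price for a signal generated on `bars[signal_idx]`,
/// or `None` when the bar needed for the fill does not exist.
pub fn fill_price(bars: &[Bar], signal_idx: usize, mode: FillMode) -> Option<f64> {
    match mode {
        FillMode::AtClose => bars.get(signal_idx).map(|b| b.close),
        FillMode::AtNextOpen => bars.get(signal_idx.checked_add(1)?).map(|b| b.open),
    }
}
```

A signal on the last bar cannot fill under `AtNextOpen`, which is the intended behavior: the backtest has no future bar to trade on.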

5.4 Concepts You Must Understand First

Before starting implementation, verify your understanding:

| Concept | Self-Assessment Question | Where to Learn |
|---------|--------------------------|----------------|
| Time series data | How do you iterate through a million records in chronological order efficiently? | Any data structures book |
| Floating-point | Why shouldn’t you compare prices with ==? | “What Every Computer Scientist Should Know About Floating-Point Arithmetic” |
| P&L calculation | How do you calculate the profit when closing a partial position? | Any trading book |
| Statistical metrics | How is Sharpe ratio calculated and annualized? | “Active Portfolio Management” by Grinold & Kahn |
| Event-driven design | What is the observer pattern and why is it useful here? | “Design Patterns” (Gang of Four) |
| Market microstructure | What is the difference between bid, ask, and midpoint? | “Trading and Exchanges” by Larry Harris |
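
To make the floating-point question concrete, here is a small sketch of two common workarounds: comparing within a tolerance, and converting prices to integer cents. The helper names `prices_equal` and `to_cents` are made up for this example, and the choice of tolerance is an assumption you would tune to your instrument's tick size.

```rust
/// Compare two prices within an absolute tolerance instead of using `==`.
pub fn prices_equal(a: f64, b: f64, tolerance: f64) -> bool {
    (a - b).abs() <= tolerance
}

/// Convert a dollar price to integer cents, rounding to the nearest cent.
/// Integer cents compare exactly, which sidesteps the problem entirely.
pub fn to_cents(price: f64) -> i64 {
    (price * 100.0).round() as i64
}
```

For example, `0.1 + 0.2 == 0.3` is false in f64 arithmetic, while `prices_equal(0.1 + 0.2, 0.3, 1e-9)` is true and both sides map to 30 cents.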

5.5 Questions to Guide Your Design

Data Handling:

  1. How will you handle missing data (gaps in the time series)?
  2. How will you handle corporate actions (splits, dividends)?
  3. How will you efficiently load data that doesn’t fit in memory?
  4. How will you handle multiple symbols with different timestamps?

Event Processing:

  1. How will you merge events from multiple symbols into one timeline?
  2. How will you prevent lookahead bias in your design?
  3. How will you handle events that occur at the exact same timestamp?
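
One possible answer to the merging and same-timestamp questions is a k-way merge over per-symbol streams using a min-heap. The sketch below assumes each stream is already sorted by timestamp; the `Event` fields and the tie-break rule (lower stream index first) are illustrative choices, not requirements.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Minimal event for this sketch; field names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Event {
    pub timestamp: i64,
    pub symbol_id: u32,
}

/// Merge per-symbol streams (each sorted by timestamp) into one timeline.
/// Ties on timestamp are broken by stream index, so the output is deterministic.
pub fn merge_streams(streams: &[Vec<Event>]) -> Vec<Event> {
    // Heap entries are (timestamp, stream index, position in stream);
    // `Reverse` turns the max-heap into a min-heap.
    let mut heap = BinaryHeap::new();
    for (s, stream) in streams.iter().enumerate() {
        if let Some(first) = stream.first() {
            heap.push(Reverse((first.timestamp, s, 0usize)));
        }
    }

    let mut merged = Vec::with_capacity(streams.iter().map(Vec::len).sum());
    while let Some(Reverse((_, s, i))) = heap.pop() {
        merged.push(streams[s][i]);
        // Queue the next event from the stream we just consumed from.
        if let Some(next) = streams[s].get(i + 1) {
            heap.push(Reverse((next.timestamp, s, i + 1)));
        }
    }
    merged
}
```

A fixed tie-break matters for reproducibility: without it, two runs over the same data could process simultaneous events in different orders.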

Order Execution:

  1. What happens when a limit order’s price is between bid and ask?
  2. How will you model the queue position for limit orders?
  3. How will you simulate latency between order submission and execution?
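
For the latency question, one simple model is to make an order eligible only at the first tick that is both strictly after submission and at or after `submitted_at + latency`. The `PendingOrder` type and `first_eligible_tick` function below are assumptions for this sketch, with timestamps in whatever unit your data uses.

```rust
/// Hypothetical pending order for this sketch.
pub struct PendingOrder {
    pub submitted_at: i64,
    pub latency: i64,
}

/// Index of the first tick at which the order may fill.
/// The tick must be strictly later than submission (no same-tick fills)
/// and no earlier than `submitted_at + latency`.
pub fn first_eligible_tick(tick_timestamps: &[i64], order: &PendingOrder) -> Option<usize> {
    let eligible_at = order.submitted_at + order.latency;
    tick_timestamps
        .iter()
        .position(|&t| t > order.submitted_at && t >= eligible_at)
}
```

With zero latency this reduces to a plain next-tick fill, so the same code path covers both cases.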

Metrics:

  1. How will you handle the risk-free rate in Sharpe calculation?
  2. How will you handle strategies that don’t trade every day in Sharpe?
  3. How will you handle short positions in drawdown calculation?

5.6 Thinking Exercise

Before writing code, work through this scenario by hand:

Scenario:

You start with $100,000. You’re backtesting a simple mean-reversion strategy on AAPL. Here are the events:

Day 1, 09:30: AAPL bid=150.00, ask=150.02
Day 1, 10:00: Strategy says BUY 100 shares MARKET
              (Fill at 150.02 + 0.05 slippage = 150.07)
Day 1, 15:00: AAPL bid=149.00, ask=149.02

Day 2, 09:30: AAPL bid=148.00, ask=148.02
Day 2, 10:00: Strategy says BUY 100 more shares MARKET
              (Fill at 148.02 + 0.05 slippage = 148.07)
Day 2, 15:00: AAPL bid=151.00, ask=151.02

Day 3, 09:30: AAPL bid=152.00, ask=152.02
Day 3, 10:00: Strategy says SELL 200 shares MARKET
              (Fill at 152.00 - 0.05 slippage = 151.95)
Day 3, 15:00: Backtest ends

Questions to answer on paper:

  1. After Day 1’s buy, what is your position? What is your average price?
  2. After Day 1’s close, what is your unrealized P&L?
  3. After Day 2’s buy, what is your new average price?
  4. After Day 2’s close, what is your unrealized P&L?
  5. After Day 3’s sell, what is your realized P&L? (The scenario specifies no commission, so ignore it here)
  6. What were your daily returns (for Sharpe calculation)?

Answers to verify:

  1. Position: 100 shares, Avg price: $150.07
  2. Day 1 unrealized: 100 * (149.00 - 150.07) = -$107.00
  3. New avg: (100 * 150.07 + 100 * 148.07) / 200 = $149.07
  4. Day 2 unrealized: 200 * (151.00 - 149.07) = $386.00
  5. Realized P&L: 200 * (151.95 - 149.07) = $576.00 (ignoring commission)
  6. Day 1: -107/100000 = -0.107%, Day 2: (386 - (-107))/99893 = +0.494%, Day 3: (576 - 386)/100386 = +0.189%
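
To check the hand calculation, the scenario can be replayed as an equity curve, where equity is cash plus the position marked at the closing bid. This is a minimal sketch with the scenario's numbers hard-coded; `equity`, `scenario_equity_curve`, and `daily_returns` are names invented for the example. Each day's return divides by the previous day's closing equity.

```rust
/// Equity = cash + long position marked at the bid (where it could be sold).
fn equity(cash: f64, shares: f64, bid: f64) -> f64 {
    cash + shares * bid
}

/// Replays the three-day AAPL scenario and returns equity at the start
/// and at each day's close.
pub fn scenario_equity_curve() -> Vec<f64> {
    let mut cash = 100_000.0;
    let mut shares = 0.0;
    let mut curve = vec![cash]; // starting capital

    // Day 1: buy 100 @ 150.07, closing bid 149.00
    cash -= 100.0 * 150.07;
    shares += 100.0;
    curve.push(equity(cash, shares, 149.00));

    // Day 2: buy 100 @ 148.07, closing bid 151.00
    cash -= 100.0 * 148.07;
    shares += 100.0;
    curve.push(equity(cash, shares, 151.00));

    // Day 3: sell 200 @ 151.95, position is flat afterwards
    cash += 200.0 * 151.95;
    shares -= 200.0;
    curve.push(equity(cash, shares, 0.0));

    curve
}

/// Simple returns between consecutive equity values: (curr - prev) / prev.
pub fn daily_returns(curve: &[f64]) -> Vec<f64> {
    curve.windows(2).map(|w| (w[1] - w[0]) / w[0]).collect()
}
```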

5.7 Hints in Layers

Use these hints progressively - only look at the next level when stuck.

Hint 1 - Starting Point

Start with the simplest possible implementation:

  1. Load all data into memory as a vector of events
  2. Sort by timestamp
  3. Iterate through, calling strategy for each event
  4. Use a simple hashmap to track positions
  5. Calculate metrics at the end

Don’t worry about efficiency initially - get correctness first.

// Simple event structure
struct TickEvent {
    timestamp: i64,
    symbol: String,
    bid: f64,
    ask: f64,
}

// Simple position tracking
struct Position {
    quantity: i64,
    avg_price: f64,
}

// Strategy trait
trait Strategy {
    fn on_tick(&mut self, event: &TickEvent) -> Vec<Order>;
}

Hint 2 - Fill Simulation

For market orders, implement a simple slippage model:

fn simulate_fill(order: &Order, market: &TickEvent, slippage: f64) -> Fill {
    let base_price = match order.side {
        Side::Buy => market.ask,   // Pay the ask for buys
        Side::Sell => market.bid,  // Receive the bid for sells
    };

    // Apply slippage as a fraction of price (0.001 = 0.1%), always against you
    let fill_price = match order.side {
        Side::Buy => base_price * (1.0 + slippage),
        Side::Sell => base_price * (1.0 - slippage),
    };

    Fill {
        order_id: order.id,
        price: fill_price,
        quantity: order.quantity,
        timestamp: market.timestamp,
    }
}

For limit orders, check if the price is reachable:

fn can_fill_limit(order: &Order, market: &TickEvent) -> bool {
    match order.side {
        Side::Buy => market.ask <= order.limit_price,
        Side::Sell => market.bid >= order.limit_price,
    }
}

Hint 3 - Metrics Calculation

Calculate returns correctly for Sharpe:

fn calculate_daily_returns(equity_curve: &[(i64, f64)]) -> Vec<f64> {
    // Group by day, take end-of-day equity.
    // `group_by_day` is a helper you write yourself; `Date` can be chrono::NaiveDate.
    let daily_equity: Vec<(Date, f64)> = group_by_day(equity_curve);

    // Calculate returns
    let mut returns = Vec::new();
    for i in 1..daily_equity.len() {
        let prev = daily_equity[i-1].1;
        let curr = daily_equity[i].1;
        let daily_return = (curr - prev) / prev;
        returns.push(daily_return);
    }
    returns
}

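fn calculate_sharpe(returns: &[f64], risk_free: f64) -> f64 {
    // Sample variance needs at least two observations;
    // returning 0.0 here is a convention, not a standard
    if returns.len() < 2 {
        return 0.0;
    }
    let n = returns.len() as f64;
    let daily_rf = risk_free / 252.0;

    // Calculate mean excess return
    let mean_excess: f64 = returns.iter()
        .map(|r| r - daily_rf)
        .sum::<f64>() / n;

    // Calculate sample standard deviation (n - 1 denominator)
    let variance: f64 = returns.iter()
        .map(|r| (r - daily_rf - mean_excess).powi(2))
        .sum::<f64>() / (n - 1.0);
    let std_dev = variance.sqrt();

    // A flat return series would divide by zero
    if std_dev == 0.0 {
        return 0.0;
    }

    // Annualize
    (mean_excess / std_dev) * (252.0_f64).sqrt()
}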

For max drawdown:

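fn calculate_max_drawdown(equity_curve: &[f64]) -> f64 {
    // An empty curve has no drawdown (and indexing [0] would panic)
    if equity_curve.is_empty() {
        return 0.0;
    }
    let mut peak = equity_curve[0];
    let mut max_dd = 0.0;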

    for &equity in equity_curve.iter() {
        if equity > peak {
            peak = equity;
        }
        let drawdown = (peak - equity) / peak;
        if drawdown > max_dd {
            max_dd = drawdown;
        }
    }
    max_dd
}

Hint 4 - Performance Optimization

When correctness is verified, optimize:

Data loading:

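// Use memory-mapped files for large datasets
// (requires adding the memmap2 crate to Cargo.toml)
use memmap2::Mmap;
use std::fs::File;
use std::io;

fn load_mmap(path: &str) -> io::Result<Mmap> {
    let file = File::open(path)?;
    // SAFETY: the file must not be modified or truncated while it is mapped
    unsafe { Mmap::map(&file) }
}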

// Parse without copying
fn parse_tick_zero_copy(bytes: &[u8]) -> TickEvent {
    // Read directly from byte slice
    // Assumes fixed-size binary format
    TickEvent {
        timestamp: i64::from_le_bytes(bytes[0..8].try_into().unwrap()),
        // ...
    }
}

Event iteration:

// Use streaming instead of loading all at once
use std::fs::File;
use std::io::BufReader;

struct EventIterator {
    reader: BufReader<File>,
    buffer: Vec<u8>,
}

impl Iterator for EventIterator {
    type Item = TickEvent;

    fn next(&mut self) -> Option<Self::Item> {
        // Read one record at a time into self.buffer,
        // parse it, and return Some(event); return None at end of file
        todo!()
    }
}

Avoid allocations in hot path:

// Pre-allocate the trade log
let mut trades: Vec<Trade> = Vec::with_capacity(10000);

// Use fixed-size arrays for symbols instead of String
type Symbol = [u8; 8];

5.8 The Interview Questions They’ll Ask

After completing this project, you should be able to answer:

  1. “How do you prevent lookahead bias in a backtest?”

    Expected: Explain that the strategy only receives data up to the current timestamp. Orders are filled on future ticks, not the current one. The design enforces temporal ordering.

  2. “What is slippage and how do you model it?”

    Expected: Define slippage as difference between expected and actual price. Explain components: spread crossing, market impact, latency. Describe your model (fixed percentage, volume-based, or order book).

  3. “How do you calculate Sharpe ratio and what are its limitations?”

    Expected: Give the formula. Explain annualization. Discuss limitations: assumes normal distribution, penalizes upside volatility, sensitive to measurement period.

  4. “What is overfitting in the context of backtesting?”

    Expected: Explain that optimizing parameters on historical data finds patterns that don’t generalize. Describe walk-forward testing, out-of-sample validation, and the difference between in-sample and out-of-sample performance.

  5. “How would you handle a strategy that trades multiple symbols?”

    Expected: Discuss merging event streams, maintaining separate positions per symbol, calculating portfolio-level metrics, and handling cross-symbol correlations.

  6. “What’s the difference between backtesting and paper trading?”

    Expected: Backtest uses historical data (fast, repeatable, but simulated). Paper trading uses live data with simulated execution (slower, but tests real-time behavior).

  7. “How does your backtester handle partial fills?”

    Expected: Explain that large orders may not fill completely at one price. Describe how you track partially filled orders and update average prices.

  8. “Why might a backtest show profit but live trading show loss?”

    Expected: List reasons: unrealistic fill assumptions, lookahead bias, survivorship bias, regime change, transaction costs, market impact not modeled, and execution latency.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Trading concepts | “Trading and Exchanges” by Larry Harris | Ch. 1-5, 11-13 |
| Quantitative methods | “Active Portfolio Management” by Grinold & Kahn | Ch. 2-4, 14 |
| Backtesting | “Algorithmic Trading” by Ernest Chan | Ch. 1-4 |
| Strategy development | “Quantitative Trading” by Ernest Chan | Ch. 1-5 |
| Risk management | “Risk Management and Financial Institutions” by John Hull | Ch. 1-3 |
| Performance metrics | “The Quants” by Scott Patterson | (General context) |
| Systems design | “Designing Data-Intensive Applications” by Kleppmann | Ch. 1-3 |
| Statistical methods | “Advances in Financial Machine Learning” by Marcos Lopez de Prado | Ch. 8-11 |

5.10 Implementation Phases

Phase 1: Basic Framework (Days 1-4)

Goals:

  • Load tick data from CSV
  • Implement basic event loop
  • Create strategy interface
  • Simple position tracking

Tasks:

  1. Define data structures (Tick, Bar, Order, Fill, Position)
  2. Implement CSV parser for tick data
  3. Create event loop that iterates through sorted events
  4. Define Strategy trait with on_tick callback
  5. Implement simple position tracker
  6. Create a trivial “buy and hold” strategy for testing

Checkpoint: Can load data, run a strategy, and see position changes.
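
As a rough picture of what this checkpoint could look like, the sketch below wires a trivial buy-and-hold strategy into a minimal event loop. All types are simplified stand-ins invented for the example (for instance, `Order` carries only a signed quantity and is assumed to be a market order), and orders fill on the tick after they are emitted.

```rust
#[derive(Debug, Clone)]
pub struct TickEvent {
    pub timestamp: i64,
    pub bid: f64,
    pub ask: f64,
}

/// Simplified market order: positive quantity = buy, negative = sell.
#[derive(Debug, Clone, PartialEq)]
pub struct Order {
    pub quantity: i64,
}

pub trait Strategy {
    fn on_tick(&mut self, event: &TickEvent) -> Vec<Order>;
}

/// Buys a fixed quantity on the first tick it sees, then holds.
pub struct BuyAndHold {
    quantity: i64,
    bought: bool,
}

impl BuyAndHold {
    pub fn new(quantity: i64) -> Self {
        Self { quantity, bought: false }
    }
}

impl Strategy for BuyAndHold {
    fn on_tick(&mut self, _event: &TickEvent) -> Vec<Order> {
        if self.bought {
            return Vec::new();
        }
        self.bought = true;
        vec![Order { quantity: self.quantity }]
    }
}

/// Replays ticks in order. Orders emitted on one tick fill on the next:
/// buys at the ask, sells at the bid. Returns (timestamp, quantity, price).
pub fn run(ticks: &[TickEvent], strategy: &mut dyn Strategy) -> Vec<(i64, i64, f64)> {
    let mut fills = Vec::new();
    let mut pending: Vec<Order> = Vec::new();
    for tick in ticks {
        // Fill earlier orders before the strategy sees this tick.
        for order in pending.drain(..) {
            let price = if order.quantity > 0 { tick.ask } else { tick.bid };
            fills.push((tick.timestamp, order.quantity, price));
        }
        pending = strategy.on_tick(tick);
    }
    fills
}
```

Orders still pending when the data ends are simply dropped in this sketch; a fuller engine would report them as unfilled.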

Phase 2: Order Management and Fills (Days 5-8)

Goals:

  • Order lifecycle management
  • Fill simulation with slippage
  • P&L calculation

Tasks:

  1. Implement Order struct with status tracking
  2. Implement Order Manager (pending orders, filled orders)
  3. Create Fill Simulator with configurable slippage
  4. Implement market order fills (at next tick)
  5. Implement limit order fills (when price is hit)
  6. Create P&L calculator (per trade and cumulative)
  7. Test with known scenarios (compare to hand calculation)

Checkpoint: Orders get filled realistically, P&L is calculated correctly.
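
Order status tracking, including partial fills, might be sketched as follows. The `OrderStatus` variants and the `Order` fields are assumptions for illustration (a real order would also carry an id, side, symbol, and type), and clamping oversized fills to the remaining quantity is one possible policy.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OrderStatus {
    Pending,
    PartiallyFilled,
    Filled,
}

/// Hypothetical order with just enough state to track fills.
#[derive(Debug)]
pub struct Order {
    pub quantity: u64,
    pub filled_quantity: u64,
    pub avg_fill_price: f64,
    pub status: OrderStatus,
}

impl Order {
    pub fn new(quantity: u64) -> Self {
        Self {
            quantity,
            filled_quantity: 0,
            avg_fill_price: 0.0,
            status: OrderStatus::Pending,
        }
    }

    /// Record a (possibly partial) fill and return the quantity applied.
    /// Fills larger than the remaining quantity are clamped.
    pub fn apply_fill(&mut self, fill_qty: u64, fill_price: f64) -> u64 {
        let remaining = self.quantity - self.filled_quantity;
        let applied = fill_qty.min(remaining);
        if applied == 0 {
            return 0;
        }
        // Quantity-weighted average of all fills so far.
        let prev_notional = self.avg_fill_price * self.filled_quantity as f64;
        self.filled_quantity += applied;
        self.avg_fill_price =
            (prev_notional + fill_price * applied as f64) / self.filled_quantity as f64;
        self.status = if self.filled_quantity == self.quantity {
            OrderStatus::Filled
        } else {
            OrderStatus::PartiallyFilled
        };
        applied
    }
}
```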

Phase 3: Metrics and Analysis (Days 9-12)

Goals:

  • Performance metrics calculation
  • Equity curve generation
  • Trade logging

Tasks:

  1. Implement daily return calculation
  2. Implement Sharpe ratio calculation
  3. Implement max drawdown calculation
  4. Add Sortino ratio and profit factor
  5. Generate equity curve data
  6. Create trade log (CSV output)
  7. Create summary report (terminal output)

Checkpoint: Full performance report is generated with accurate metrics.
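
The Sortino ratio and profit factor from the task list could be sketched as below. Conventions differ between sources: this version assumes 252 trading days and computes downside deviation by averaging squared shortfalls below `daily_target` over all observations. Returning `None` for undefined cases is a choice made for this example.

```rust
/// Annualized Sortino ratio from daily returns (assumes 252 trading days).
/// Returns `None` for an empty series or when no return falls below the target.
pub fn sortino_ratio(daily_returns: &[f64], daily_target: f64) -> Option<f64> {
    if daily_returns.is_empty() {
        return None;
    }
    let n = daily_returns.len() as f64;
    let mean_excess = daily_returns.iter().map(|r| r - daily_target).sum::<f64>() / n;

    // Only shortfalls below the target contribute to downside variance.
    let downside_var = daily_returns
        .iter()
        .map(|r| (r - daily_target).min(0.0).powi(2))
        .sum::<f64>()
        / n;
    if downside_var == 0.0 {
        return None;
    }
    Some(mean_excess / downside_var.sqrt() * 252.0_f64.sqrt())
}

/// Profit factor = gross profit / gross loss over closed-trade P&Ls.
/// Returns `None` when there are no losing trades.
pub fn profit_factor(trade_pnls: &[f64]) -> Option<f64> {
    let gross_profit: f64 = trade_pnls.iter().filter(|p| **p > 0.0).sum();
    let gross_loss: f64 = trade_pnls.iter().filter(|p| **p < 0.0).map(|p| -p).sum();
    if gross_loss == 0.0 {
        None
    } else {
        Some(gross_profit / gross_loss)
    }
}
```

Unlike Sharpe, Sortino does not penalize upside volatility, which is why the two can rank the same strategies differently.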

Phase 4: Optimization and Extensions (Days 13-18)

Goals:

  • Handle large datasets efficiently
  • Support multiple symbols
  • Multiple strategies comparison

Tasks:

  1. Implement memory-efficient data loading (streaming or mmap)
  2. Benchmark with large datasets (1M+ ticks)
  3. Add multi-symbol support
  4. Implement strategy comparison mode
  5. Add latency simulation
  6. Create example strategies (momentum, mean reversion)
  7. Write comprehensive documentation

Checkpoint: Can process 1M ticks/second, compare multiple strategies.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Price representation | f64, fixed-point | f64 for simplicity | Fixed-point is faster but adds complexity; f64 is fine for backtesting |
| Event ordering | Sort in memory, merge-sort files | Sort in memory first | Memory is cheap; optimize only if needed |
| Fill timing | Same tick, next tick | Next tick | More realistic; prevents lookahead |
| Slippage model | Fixed, percentage, volume-based | Percentage | Good balance of simplicity and realism |
| Position tracking | Per-symbol hashmap | HashMap<Symbol, Position> | O(1) lookup, handles multiple symbols |
| Return calculation | Per-tick, per-day | Per-day | Standard for Sharpe; less noisy |

6. Testing Strategy

6.1 Unit Tests

| Test Category | What to Test |
|---------------|--------------|
| Data parsing | Correct parsing of tick/bar data, handling of edge cases |
| Event ordering | Events are processed in correct chronological order |
| Order matching | Market and limit orders fill at correct prices |
| Position tracking | Quantity and average price update correctly |
| P&L calculation | Matches hand-calculated examples |
| Metric calculation | Sharpe, drawdown match known values |

6.2 Critical Test Cases

Test 1: Buy and Sell - Profit
  Start: $100,000
  BUY 100 @ $50.00 (cost: $5,000)
  Price moves to $55.00
  SELL 100 @ $55.00 (revenue: $5,500)
  Expected P&L: +$500
  Final capital: $100,500

Test 2: Buy and Sell - Loss
  Start: $100,000
  BUY 100 @ $50.00 (cost: $5,000)
  Price moves to $45.00
  SELL 100 @ $45.00 (revenue: $4,500)
  Expected P&L: -$500
  Final capital: $99,500

Test 3: Short Sell - Profit
  Start: $100,000
  SELL 100 @ $50.00 (receive: $5,000)
  Price moves to $45.00
  BUY 100 @ $45.00 (cost: $4,500)
  Expected P&L: +$500

Test 4: Partial Position Close
  BUY 100 @ $50.00
  BUY 100 @ $52.00
  Average price: $51.00
  SELL 50 @ $53.00
  Realized P&L on 50: 50 * ($53 - $51) = +$100
  Remaining position: 150 @ $51.00

Test 5: Slippage
  Market BUY 100 with ask @ $50.00, 0.1% slippage
  Fill price: $50.05
  Slippage cost: $5.00

Test 6: Sharpe Ratio (Known Values)
  Daily returns: [0.01, 0.02, -0.01, 0.015, 0.005]
  Mean: 0.008, StdDev (sample, n-1): 0.01151
  Daily Sharpe: 0.695
  Annualized Sharpe: 0.695 * sqrt(252) = 11.03

Test 7: Max Drawdown
  Equity curve: [100, 105, 110, 95, 100, 90, 85, 95]
  Peak: 110
  Trough: 85
  Max Drawdown: (110-85)/110 = 22.7%
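
Tests 3 and 4 above both depend on position tracking that handles signed quantities and average-cost accounting. The sketch below is one way to do that; the `Position` fields and the convention that buys are positive and sells negative are assumptions for this example, and average cost is only one of several accounting methods (FIFO and LIFO are alternatives).

```rust
/// Signed position: positive quantity = long, negative = short.
#[derive(Debug, Default)]
pub struct Position {
    pub quantity: i64,
    pub avg_price: f64,
    pub realized_pnl: f64,
}

impl Position {
    /// Apply a fill. `signed_qty` is positive for buys, negative for sells.
    pub fn apply_fill(&mut self, signed_qty: i64, price: f64) {
        if signed_qty == 0 {
            return;
        }
        let same_direction = self.quantity == 0 || (self.quantity > 0) == (signed_qty > 0);
        if same_direction {
            // Opening or adding: update the average entry price.
            let total = self.quantity + signed_qty;
            self.avg_price = (self.avg_price * self.quantity.abs() as f64
                + price * signed_qty.abs() as f64)
                / total.abs() as f64;
            self.quantity = total;
            return;
        }

        // Reducing, closing, or flipping: realize P&L on the closed portion.
        let closed = signed_qty.abs().min(self.quantity.abs());
        let direction = if self.quantity > 0 { 1.0 } else { -1.0 };
        self.realized_pnl += direction * closed as f64 * (price - self.avg_price);
        self.quantity += signed_qty;

        if self.quantity == 0 {
            self.avg_price = 0.0;
        } else if (self.quantity > 0) == (signed_qty > 0) {
            // Flipped through zero: the remainder opens at the fill price.
            self.avg_price = price;
        }
    }
}
```

Note that a partial close leaves the average price unchanged, which matches the "150 @ $51.00" result in Test 4.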

6.3 Integration Tests

Test: Complete Backtest Flow
  1. Load 1000 ticks of test data
  2. Run simple moving average crossover strategy
  3. Verify number of trades matches expected
  4. Verify final P&L matches expected (within tolerance)
  5. Verify Sharpe ratio matches expected (within tolerance)

Test: Multi-Symbol Backtest
  1. Load data for AAPL and MSFT
  2. Run strategy that trades both
  3. Verify positions tracked separately
  4. Verify portfolio-level metrics are correct

Test: Reproducibility
  1. Run same backtest twice with same data
  2. Verify identical results (to the penny)

7. Common Pitfalls and Debugging

| Problem | Symptom | Cause | Solution | Verification |
|---------|---------|-------|----------|--------------|
| Lookahead bias | Strategy is too profitable | Using future data for decisions | Ensure strategy only sees past data | Assert that every timestamp the strategy reads is <= the current event timestamp |
| Survivorship bias | Only tested on stocks that exist today | Didn’t include delisted stocks | Use point-in-time data | Verify data includes stocks that were later delisted |
| Off-by-one in returns | Sharpe is wrong | Daily returns calculated incorrectly | Use (P1-P0)/P0 with P0 as the previous day's value, not (P1-P0)/P1 or P1/P0 | Compare to known examples |
| Wrong fill price | P&L seems off | Using mid instead of bid/ask | Buy at ask, sell at bid | Log fill prices, verify against market data |
| Floating-point accumulation | P&L drifts over time | Summing small floats | Use Kahan summation or track separately | Compare forward vs backward calculation |
| Timezone issues | Events out of order | Mixed UTC and local times | Normalize all timestamps to UTC | Print timestamps, verify order |
| Missing data handling | Strategy crashes | Gap in data not handled | Skip or interpolate | Test with data containing gaps |
| Position sign error | Short positions show wrong P&L | Not handling negative quantities | Long = positive, short = negative | Test short selling explicitly |
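
The Kahan summation mentioned in the floating-point row can be sketched in a few lines. It keeps a running correction term for the low-order bits lost in each addition; the function name `kahan_sum` is just for this example.

```rust
/// Kahan (compensated) summation over a slice of f64 values.
pub fn kahan_sum(values: &[f64]) -> f64 {
    let mut sum = 0.0_f64;
    let mut compensation = 0.0_f64; // accumulated rounding error
    for &v in values {
        let y = v - compensation; // apply the correction from the last step
        let t = sum + y; // low-order bits of y may be lost here
        compensation = (t - sum) - y; // recover what was lost
        sum = t;
    }
    sum
}
```

Summing one million copies of 0.1 is a quick way to see the difference: the compensated result stays very close to 100,000, while a naive running sum drifts further away.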

7.1 Debugging Strategies

Add Verbose Logging:

// Log every significant event
fn on_fill(&mut self, fill: &Fill) {
    log::debug!(
        "FILL: {} {} {} @ {} | Position: {} @ {} | Realized P&L: {}",
        fill.side, fill.quantity, fill.symbol, fill.price,
        self.position.quantity, self.position.avg_price,
        self.position.realized_pnl
    );
}

Compare to Manual Calculation:

Create a small dataset (10 events) and manually calculate what should happen. Compare against backtester output step by step.

Visualization:

Plot the equity curve. Obvious errors (like P&L suddenly jumping) are easier to spot visually than in numbers.

Good equity curve:        Bad equity curve (bug):
   ^                         ^
   |    __/\                 |        /|
   |   /    \__/\           |       / |  <- Sudden jump
   |  /         \           | _____|  |
   | /           \__        |/        |
   |/                       |         |
   +----------------->      +----------------->

8. Extensions and Challenges

8.1 Basic Extensions

  • Add stop-loss orders: Trigger sell when price drops below threshold
  • Add take-profit orders: Trigger sell when price reaches target
  • Commission modeling: Add per-trade or per-share commissions
  • Multiple timeframes: Support both tick and bar-level strategies
  • ASCII chart output: Display P&L curve in terminal

8.2 Intermediate Extensions

  • Walk-forward analysis: Train on period 1, test on period 2, roll forward
  • Monte Carlo simulation: Randomize order of trades to estimate variance
  • Optimization engine: Grid search or genetic algorithm for parameter tuning
  • Risk metrics: Value-at-Risk (VaR), Expected Shortfall
  • Transaction cost analysis (TCA): Detailed breakdown of execution costs
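
For walk-forward analysis, the index bookkeeping is the part that is easy to get wrong. The sketch below generates rolling (train, test) ranges over `n` observations; rolling forward by `test_len` so that test ranges never overlap is one common choice, and the function name is hypothetical.

```rust
use std::ops::Range;

/// Generate (train, test) index ranges over `n` observations.
/// Each window trains on `train_len` points, tests on the next `test_len`,
/// then rolls forward by `test_len`.
pub fn walk_forward_windows(
    n: usize,
    train_len: usize,
    test_len: usize,
) -> Vec<(Range<usize>, Range<usize>)> {
    let mut windows = Vec::new();
    if train_len == 0 || test_len == 0 {
        return windows;
    }
    let mut start = 0;
    while start + train_len + test_len <= n {
        let split = start + train_len;
        // The test range begins exactly where the train range ends.
        windows.push((start..split, split..split + test_len));
        start += test_len;
    }
    windows
}
```

With 10 observations, a training length of 4, and a test length of 2, this yields three windows whose test ranges cover indices 4 through 9 exactly once.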

8.3 Advanced Extensions

  • Order book simulation: Use L2 data for realistic fills
  • Multi-asset portfolio: Rebalancing, correlation analysis
  • Live trading bridge: Same strategy interface for paper/live trading
  • Strategy templating: DSL for defining strategies without code
  • Distributed backtesting: Run parameter sweeps across multiple machines

9. Real-World Connections

9.1 How Production Systems Differ

| Aspect | This Project | Production System |
|--------|--------------|-------------------|
| Data volume | GB | TB-PB |
| Fill simulation | Simple slippage | Full order book replay |
| Latency modeling | Optional | Critical |
| Corporate actions | Not handled | Full adjustment for splits, dividends |
| Universe management | Fixed symbols | Dynamic with survivorship-bias-free data |
| Execution | Simulated | Real broker API |
| Monitoring | None | Continuous performance monitoring |

9.2 Industry Tools

| Tool | Type | Use Case |
|------|------|----------|
| QuantConnect | Cloud platform | Free backtesting with data included |
| Backtrader | Python library | Popular open-source backtester |
| Zipline | Python library | Originally from Quantopian |
| Lean | C# engine | QuantConnect’s open-source engine |
| Vectorbt | Python library | Vectorized backtesting (fast) |
| Institutional | Custom C++/Java | Bloomberg AIM, Fidessa, etc. |

9.3 Data Providers

| Provider | Data Type | Cost |
|----------|-----------|------|
| Yahoo Finance | Daily bars | Free |
| Alpha Vantage | Intraday bars | Free tier available |
| Polygon.io | Ticks, bars, options | Paid |
| Quandl/Nasdaq | Various | Paid |
| IEX Cloud | Delayed quotes | Free tier available |
| Institutional | Full tick, L2/L3 | Very expensive |

10. Resources

10.1 Essential Reading

  • “Algorithmic Trading” by Ernest Chan - Practical backtesting and strategy development
  • “Trading and Exchanges” by Larry Harris - Market microstructure bible
  • “Advances in Financial Machine Learning” by Marcos Lopez de Prado - Modern backtesting pitfalls

10.2 Articles and Papers

  • “The Probability of Backtest Overfitting” (Lopez de Prado & Bailey)
  • “Quantitative Trading: How to Build Your Own Algorithmic Trading Business” - Free draft chapters
  • “Zipline: Pythonic Algorithmic Trading Library” - Design document

10.3 GitHub Repositories

10.4 Videos and Courses

  • QuantConnect YouTube channel - Lean engine tutorials
  • Quantopian archived lectures (still available online)
  • Ernest Chan’s courses on Quantra

11. Self-Assessment Checklist

Understanding

  • I can explain the difference between backtesting and paper trading
  • I can describe at least 3 sources of backtest bias
  • I know how to calculate Sharpe ratio and what it means
  • I understand why slippage matters even for long-term strategies
  • I can explain the event-driven architecture of a backtester
  • I know why limit orders might not fill even when the price is hit

Implementation

  • My backtester loads data and replays events in correct order
  • Orders are filled with realistic slippage
  • P&L calculation matches manual verification
  • Sharpe ratio and max drawdown are calculated correctly
  • I can run multiple strategies and compare them
  • Trade logs are generated with all execution details

Growth

  • I’ve tested my backtester on at least 1 year of data
  • I’ve compared results to a known reference (e.g., simple moving average)
  • I understand the limitations of my fill simulation
  • I could extend this to connect to a live trading API

12. Submission / Completion Criteria

Minimum Viable Completion

  • Load tick data from CSV file
  • Run a simple strategy (e.g., buy-and-hold or MA crossover)
  • Calculate P&L and display final result
  • Generate basic trade log
  • Code compiles without warnings

Full Completion

  • Support both market and limit orders
  • Implement configurable slippage model
  • Calculate Sharpe ratio, max drawdown, and profit factor
  • Handle multiple symbols
  • Process 1M+ ticks in under 10 seconds
  • Generate comprehensive performance report
  • All unit tests pass

Excellence (Going Above and Beyond)

  • Walk-forward validation implemented
  • Parameter optimization with out-of-sample testing
  • Memory-efficient streaming for multi-GB datasets
  • Comparison mode for multiple strategies
  • ASCII equity curve visualization
  • Documentation of design decisions and performance analysis

Summary

You have now built one of the most important tools in quantitative finance: a trading strategy backtester.

Along the way, you have mastered:

  1. Event-driven architecture - Processing time-ordered events with strategy callbacks
  2. Market microstructure - Understanding how orders actually get filled
  3. Fill simulation - Modeling slippage, latency, and partial fills
  4. Performance metrics - Calculating Sharpe ratio, drawdown, and profit factor
  5. Quantitative skepticism - Understanding why backtests lie and how to be less fooled

This knowledge directly translates to:

  • Quant trading roles
  • Fintech development
  • Risk management
  • Data science in finance
  • Algorithmic trading startups

The backtester you built can process millions of events and generate production-quality performance reports. More importantly, you now understand the limitations of backtesting - which is the most valuable skill a quantitative developer can have.

Every profitable strategy started with a backtest. Now you have the tool to find yours.

Go build something profitable.


Estimated completion time: 2-3 weeks
Lines of code: 1500-3000 (depending on feature completeness)
Next step: Paper trading with the same strategy interface