Project 5: The Monte Carlo Casino (Probability, Statistics, and Decision Math)

Build a deterministic casino simulation suite that estimates probabilities, expected value, and house edge from repeated random trials.

Quick Reference

  • Difficulty: Level 2 (Beginner-Intermediate)
  • Time Estimate: 8-16 hours
  • Main Language: Python
  • Alternatives: R, Julia, JavaScript
  • Key Topics: Random variables, law of large numbers, expected value, variance, confidence intervals
  • Deliverable: CLI simulator for dice/roulette-style games with statistical reports
  • Observable Outcome: Simulated frequencies converge toward theoretical probabilities with increasing trials

Learning Objectives

By the end of this project, you should be able to:

  1. Model games of chance as random experiments with measurable outcomes.
  2. Compute empirical probabilities from simulation data.
  3. Compare theoretical probabilities with Monte Carlo estimates.
  4. Estimate expected value (EV) per round and cumulative bankroll drift.
  5. Explain why convergence improves with trial count but never becomes exact.
  6. Use deterministic seeds for reproducible testing and debugging.
  7. Interpret variance and confidence intervals for simulation reliability.
  8. Connect probability math to real house-edge economics.

Theory Needed

1) Random Experiments and Outcome Spaces

A simulation starts with clear outcome definitions. For a dice game, an outcome is a tuple (die1, die2). For roulette, it is the landed pocket. Your estimator quality depends on precise event definition.

Core terms:

  • Sample space: all possible outcomes.
  • Event: subset of outcomes (e.g., sum == 7).
  • Indicator variable: 1 if event happened, else 0.

ASCII model:

Single trial
   |
   v
Generate random outcome
   |
   +--> event happened? yes -> indicator=1
   |
   +--> no              -> indicator=0
   |
   v
Accumulate indicators across N trials
   |
   v
Estimated probability = (sum of indicators) / N
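The flow above can be sketched directly in Python. This is a minimal illustration, not the project's required API; `estimate_probability` and the event lambda are names invented for the example:

```python
import random

def estimate_probability(event, trials, seed):
    """Estimate P(event) by averaging indicator variables over N trials."""
    rng = random.Random(seed)   # dedicated, seeded RNG for reproducibility
    hits = 0
    for _ in range(trials):
        # One trial: generate a random outcome (two dice).
        outcome = (rng.randint(1, 6), rng.randint(1, 6))
        hits += 1 if event(outcome) else 0   # indicator variable
    return hits / trials                     # sum of indicators / N

p_hat = estimate_probability(lambda o: sum(o) == 7, trials=100_000, seed=2026)
# Theory gives 6/36 ~= 0.1667; p_hat should land nearby.
```

The event is passed as a predicate so the same estimator works for any subset of the sample space.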

2) Law of Large Numbers (LLN)

The LLN states that sample averages converge toward expected values as the number of independent trials grows. This is why Monte Carlo works: each individual trial is noisy, but aggregate behavior stabilizes.

Important nuance:

  • Convergence is probabilistic, not guaranteed at finite N.
  • Short runs can look misleading even with fair games.

Practical takeaway:

  • Always track the estimate as a function of trial count.
  • Avoid drawing conclusions from tiny sample sizes.
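The takeaway can be made concrete by recording the running estimate at a few checkpoints in a single pass; the function and checkpoint set below are illustrative, not part of the spec:

```python
import random

def running_estimates(trials, seed, checkpoints):
    """Track the empirical P(sum == 7) at several trial counts in one pass."""
    rng = random.Random(seed)
    hits, estimates = 0, {}
    for t in range(1, trials + 1):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
        if t in checkpoints:
            estimates[t] = hits / t   # snapshot of p_hat at trial count t
    return estimates

# Errors tend (but are not guaranteed) to shrink as N grows:
# running_estimates(100_000, seed=2026, checkpoints={100, 1_000, 100_000})
```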

3) Expected Value and House Edge

Expected value per round:

EV = sum_over_outcomes( payoff(outcome) * P(outcome) )

If EV is negative for the player, the game has a house edge. Over many rounds, expected bankroll drift is approximately:

expected_total = rounds * EV_per_round

This is central to casino economics and risk modeling.
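The EV sum can be computed directly from a payout table. The single bet below is a hypothetical example (net 4:1 payout on "two-dice sum is 7"), not the spec's required payout schedule; exact fractions avoid float rounding in the sanity check:

```python
from fractions import Fraction

def expected_value(payoffs):
    """EV = sum over outcomes of payoff(outcome) * P(outcome)."""
    return sum(payoff * prob for payoff, prob in payoffs)

# Hypothetical bet: stake 1 unit on sum == 7; a win pays 4 units net.
p_win = Fraction(6, 36)
ev = expected_value([(4, p_win), (-1, 1 - p_win)])
# ev = 4*(1/6) - 1*(5/6) = -1/6 per round: a house edge for the player.

# expected_total = rounds * EV_per_round
expected_drift = 120_000 * ev
```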

4) Variance, Uncertainty, and Confidence

Two strategies can share the same EV but differ in variance. High variance means bigger short-term swings.

For Bernoulli event estimates, standard error approximation:

SE ~= sqrt( p_hat * (1 - p_hat) / N )

Approximate 95% interval:

p_hat +/- 1.96 * SE

This lets you communicate uncertainty instead of only point estimates.
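The two formulas above translate into a short helper. `wald_ci` is an illustrative name; this is the standard normal-approximation (Wald) interval, which assumes N is large enough that the approximation holds:

```python
import math

def wald_ci(hits, trials, z=1.96):
    """Normal-approximation 95% CI for a Bernoulli proportion."""
    p_hat = hits / trials
    se = math.sqrt(p_hat * (1 - p_hat) / trials)  # SE ~= sqrt(p_hat(1-p_hat)/N)
    return p_hat - z * se, p_hat + z * se

# Counts taken from the dice transcript in this document:
low, high = wald_ci(hits=19_949, trials=120_000)
```

Reporting the pair (low, high) alongside p_hat communicates uncertainty instead of a bare point estimate.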

5) Deterministic Randomness in Engineering

Simulation needs pseudo-random numbers, not true randomness. A fixed seed guarantees reproducible sequences, making test failures debuggable.

Pseudo-logic:

seed_rng(2026)
for trial in 1..N:
  outcome <- game.sample(rng)
  payout <- game.payout(outcome, bet)
  update_stats(outcome, payout)
report(stats)
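In Python, the pseudo-logic maps onto a `random.Random` instance created once and injected into the loop. The game functions here are hypothetical stand-ins (a sum-of-7 bet with an assumed net 4:1 payout), not the spec's modules:

```python
import random

def sample_dice(rng):
    """One trial: roll two dice with the injected RNG and return the sum."""
    return rng.randint(1, 6) + rng.randint(1, 6)

def payout_seven(outcome, bet):
    """Hypothetical payout rule: a winning sum-of-7 bet pays 4 units net."""
    return 4 * bet if outcome == 7 else -bet

def run(trials, seed, bet=1):
    rng = random.Random(seed)   # seed_rng(...): one RNG, created exactly once
    total = 0.0
    for _ in range(trials):
        outcome = sample_dice(rng)            # outcome <- game.sample(rng)
        total += payout_seven(outcome, bet)   # payout <- game.payout(outcome, bet)
    return total                              # stands in for report(stats)
```

Because the RNG is constructed once from the seed, the same (seed, trials) pair always yields an identical total, which is what makes test failures debuggable.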

Project Spec

What You Will Build

A CLI simulator with at least two game modules:

  1. Two-dice sum game (event probability calibration).
  2. Simplified roulette bet (e.g., single number or red/black style payout).

Required outputs:

  • empirical event probabilities
  • theoretical references (when known)
  • EV per round estimate
  • bankroll trajectory summary
  • confidence interval for selected events

Out of scope:

  • real-money integration
  • advanced betting systems as “winning strategy” claims
  • graphical dashboards

Functional Requirements

  1. Accept config: game type, trials, seed, bet size, initial bankroll.
  2. Run deterministic simulations from fixed seed.
  3. Report both per-event and aggregate metrics.
  4. Support batch mode for multiple trial counts (e.g., 1k, 10k, 100k).
  5. Print machine-checkable summary lines (for regression tests).

Non-Functional Requirements

  • Reproducibility: same seed => same report.
  • Performance: 1e6 simple trials should complete in seconds, not minutes, on typical hardware.
  • Clarity: output should separate theoretical vs simulated values.

Data Shapes (Conceptual)

GameResult:
  outcome_label: string
  payout_units: int or float

SimulationStats:
  trials: int
  hits_by_event: map[string -> int]
  p_hat_by_event: map[string -> float]
  ev_per_round: float
  bankroll_final: float
  ci_95_by_event: map[string -> (low, high)]
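One way to realize these conceptual shapes in Python is with dataclasses. The field names follow the spec above; the concrete types and defaults are one reasonable choice, not a requirement:

```python
from dataclasses import dataclass, field

@dataclass
class GameResult:
    outcome_label: str
    payout_units: float   # int or float per the spec; float covers both

@dataclass
class SimulationStats:
    trials: int
    hits_by_event: dict[str, int] = field(default_factory=dict)
    p_hat_by_event: dict[str, float] = field(default_factory=dict)
    ev_per_round: float = 0.0
    bankroll_final: float = 0.0
    ci_95_by_event: dict[str, tuple[float, float]] = field(default_factory=dict)
```

Mutable defaults use `field(default_factory=dict)` so each stats object gets its own maps.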

Real World Outcome

You should be able to run the following deterministic command and verify the format and trend of every reported statistic.

Deterministic CLI transcript:

$ python monte_carlo_casino.py \
  --game dice_sum \
  --event sum=7 \
  --trials 120000 \
  --seed 2026 \
  --bet 1 \
  --initial-bankroll 1000

[INFO] seed=2026 game=dice_sum event=sum=7
[INFO] trials=120000
[THEORY] P(sum=7)=0.166667
[SIM] hits=19949 p_hat=0.166242
[SIM] abs_error=0.000425
[SIM] 95pct_CI=[0.164133, 0.168351]
[EV] per_round=-0.002516 units
[BANKROLL] start=1000.00 end=698.10 delta=-301.90
[CHECKSUM] run_id=dice_sum|120000|2026|sum=7|f2e1b3a9

And for a roulette-style module:

$ python monte_carlo_casino.py --game roulette_single --number 17 --trials 200000 --seed 2026
[THEORY] P(hit)=0.027027
[SIM] hits=5428 p_hat=0.027140
[EV] per_round=-0.052960 units

Success signal: empirical estimates move toward theory as trial count increases; EV remains negative for fair casino parameters.

Solution Architecture

High-level flow:

+--------------------+
| CLI Config Parser  |
+---------+----------+
          |
          v
+--------------------+      +---------------------+
| Game Registry      |----->| Selected Game Model |
| (dice/roulette/...)|      | sample() + payout() |
+---------+----------+      +----------+----------+
          |                            |
          v                            v
+--------------------+      +---------------------+
| Simulation Engine  |----->| Statistics Aggreg.  |
| (seeded RNG loop)  |      | probs, EV, CI, bank |
+---------+----------+      +----------+----------+
          |                            |
          +------------+---------------+
                       |
                       v
                 +-----------+
                 | Reporter  |
                 +-----------+

Key Components

  • GameModel: defines outcome sampling and payout rules. Design decision: an interface per game avoids branching chaos.
  • Simulator: runs the trial loop with a seeded RNG. Design decision: keep the loop generic and reusable.
  • StatsAccumulator: maintains counts, sums, and variance terms. Design decision: single pass for performance.
  • TheoryModule: supplies known exact probabilities/EV. Design decision: enables direct comparison and sanity checks.

Algorithm Overview

Pseudo-implementation:

function simulate(game, cfg):
  rng <- seeded_rng(cfg.seed)
  stats <- init_stats(cfg)

  for t in 1..cfg.trials:
    result <- game.sample_and_score(rng, cfg)
    stats.total_payout += result.payout_units
    stats.bankroll += result.payout_units
    stats.update_event_counts(result)

  stats.compute_probabilities()
  stats.compute_ev()
  stats.compute_confidence_intervals()
  return stats

Complexity:

  • Time: O(trials)
  • Space: O(number_of_tracked_events)

Tradeoff:

  • Tracking the full bankroll timeline gives richer analysis but costs more memory. Consider optional sampled checkpoints instead of recording every round.

Implementation Guide

Phase 1: Game Definitions

  1. Define one abstract game contract (sample_and_score).
  2. Implement dice_sum module first.
  3. Add theoretical formulas for simple sanity checks.
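A minimal version of these three steps might look like the sketch below. `sample_and_score` follows the spec's contract name; `ScoredResult`, the 4:1 net payout, and `theoretical_p` are illustrative assumptions:

```python
import abc
from dataclasses import dataclass

@dataclass
class ScoredResult:
    outcome_label: str
    payout_units: float

class GameModel(abc.ABC):
    """Contract every game module implements; keeps the simulation loop generic."""
    @abc.abstractmethod
    def sample_and_score(self, rng, bet: float) -> ScoredResult: ...

class DiceSumGame(GameModel):
    """First concrete module: bet that two dice sum to 7 (net 4:1 payout assumed)."""
    def sample_and_score(self, rng, bet: float) -> ScoredResult:
        total = rng.randint(1, 6) + rng.randint(1, 6)
        if total == 7:
            return ScoredResult("sum=7", 4 * bet)
        return ScoredResult(f"sum={total}", -bet)

    @staticmethod
    def theoretical_p(target: int) -> float:
        """Exact P(sum == target): count favorable (die1, die2) pairs out of 36."""
        favorable = sum(1 for a in range(1, 7) for b in range(1, 7) if a + b == target)
        return favorable / 36
```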

Checkpoint: one deterministic dice simulation outputs valid p_hat for sum=7.

Phase 2: Simulation Core

  1. Build seeded RNG wrapper.
  2. Implement trial loop and counter updates.
  3. Record aggregate payout and bankroll delta.

Checkpoint: same seed and config produces identical hits and checksum.

Phase 3: Statistics and Reporting

  1. Compute p_hat, absolute error, EV.
  2. Add confidence intervals.
  3. Print transcript-friendly summary fields.

Checkpoint: output format stable enough for text snapshot tests.

Phase 4: Second Game Module

  1. Implement roulette-like module.
  2. Define payout schedule and house edge assumptions clearly.
  3. Add cross-game comparison mode.

Checkpoint: both modules produce coherent EV signs and convergence behavior.

Testing

Deterministic Regression

  1. Fixed seed snapshot test for hits, p_hat, and checksum.
  2. Output-format contract test to protect downstream parsers.

Statistical Sanity Tests

  1. For increasing trial counts, average absolute error should trend downward.
  2. Estimated probability should stay inside broad confidence bounds most runs.
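Sanity test 1 can be sketched with a fixed-seed trial ladder. The helper below is self-contained (it rolls dice inline rather than importing the project's modules), and averaging over several seeds keeps one unlucky run from failing the test:

```python
import random

def abs_error_at(trials, seed):
    """Absolute error of the empirical P(sum == 7) estimate versus theory."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if rng.randint(1, 6) + rng.randint(1, 6) == 7)
    return abs(hits / trials - 1 / 6)

def test_error_trends_down():
    # Logarithmic trial ladder, averaged over 10 fixed seeds.
    ladder = [1_000, 10_000, 100_000]
    mean_errors = [sum(abs_error_at(n, s) for s in range(10)) / 10 for n in ladder]
    # Assert only the broad trend, not strict monotonicity between adjacent rungs.
    assert mean_errors[-1] < mean_errors[0]
```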

Edge Case Tests

  1. trials=0 is rejected with a clear error.
  2. An invalid event spec is rejected.
  3. Negative bankroll inputs are handled by a documented policy (reject or allow, but consistently).

Deterministic test transcript example:

$ pytest tests/test_monte_carlo_casino.py -q
......
6 passed in 0.27s

$ python monte_carlo_casino.py --game dice_sum --event sum=7 --trials 10000 --seed 77
[SIM] hits=1694 p_hat=0.169400
[CHECKSUM] run_id=dice_sum|10000|77|sum=7|9b3b0c11

Pitfalls

  • Symptom: results change across identical runs. Likely cause: seed not applied consistently. Fix: instantiate the RNG exactly once and inject it everywhere.
  • Symptom: EV sign is wrong. Likely cause: payout convention mismatch (profit vs. return). Fix: standardize payout as net gain/loss per round.
  • Symptom: "convergence" looks bad. Likely cause: too few trials or a cherry-picked run. Fix: run a logarithmic trial ladder (1k/10k/100k/1M).
  • Symptom: CI values are unrealistic. Likely cause: wrong SE formula or event denominator. Fix: re-check the Bernoulli assumptions and N.
  • Symptom: bankroll jumps strangely. Likely cause: double-counting bet placement. Fix: separate stake accounting from payout accounting.

Extensions

  1. Add blackjack-like simplified game state model.
  2. Add batch experiments across many seeds and summarize variance.
  3. Plot convergence curves for multiple events in one run.
  4. Add risk metrics: max drawdown and ruin probability.
  5. Explore biased RNG to demonstrate why PRNG quality matters.

Real-World Connections

  1. Quant finance: Monte Carlo pricing and risk stress testing.
  2. Reliability engineering: rare-event failure probability estimation.
  3. A/B testing intuition: uncertainty and confidence intervals in experimentation.
  4. Gaming analytics: payout design, retention economics, and fairness diagnostics.

Resources

  • Blitzstein and Hwang, Introduction to Probability.
  • Sheldon Ross, A First Course in Probability.
  • Allen B. Downey, Think Stats (practical simulation mindset).
  • Kroese et al., Handbook of Monte Carlo Methods.

Self-Assessment

  1. Why can two runs with the same N still differ substantially?
  2. What exact condition makes EV negative for the player?
  3. Why is fixed-seed determinism essential for engineering quality?
  4. How do you explain the difference between variance and expected value?
  5. What does a 95% confidence interval actually mean in repeated sampling terms?
  6. If your simulation beats theory consistently, what bugs should you suspect first?
  7. How would you redesign output to make CI and EV easier to compare across games?
  8. Why do casinos prefer many small repeated bets over few large bets?

Completion Criteria

  • At least two game modules implemented with shared simulation engine.
  • Deterministic seed-based transcript reproduces exactly for golden config.
  • Report includes p_hat, theoretical reference, EV, and CI.
  • Regression tests pass and include output snapshot checks.
  • Trial ladder demonstrates convergence trend in documented results.
  • You can explain house edge mathematically, not just intuitively.