Project 4: Is This Die Loaded?
Build a hypothesis test to detect bias in dice rolls.
Project Overview
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | Weekend |
| Main Language | Python |
| Alternative Languages | R, JavaScript |
| Knowledge Area | Hypothesis testing |
| Tools | Random number generator, CSV files |
| Main Book | “OpenIntro Statistics” by Diez et al. |
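What you’ll build: A test that analyzes dice roll data and decides whether the observed counts are consistent with a fair die.
Why it teaches stats: Hypothesis testing is the formal way to judge whether an observed deviation from fairness is larger than chance alone would plausibly produce.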
Core challenges you’ll face:
- Defining null and alternative hypotheses
- Computing a test statistic
- Interpreting p-values
Real World Outcome
Your program reads roll data and reports whether the data provide enough evidence to call the die biased.
Example Output:
$ python loaded_die.py rolls.csv
Chi-square: 10.87 (df = 5)
p-value: 0.054
Conclusion: insufficient evidence of bias at the 0.05 level
Verification steps:
- Test with simulated fair and biased dice
- Check sensitivity to sample size
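Both verification steps can be sketched with a small simulation. The version below is a minimal illustration, not a reference implementation: the function names, the loaded-die weights (a six twice as likely as any other face), and the trial counts are all assumptions chosen for the example. It uses 11.07 as the chi-square critical value for 5 degrees of freedom at the 0.05 level.

```python
import random
from collections import Counter

CRITICAL_5DF_05 = 11.07  # chi-square critical value, df = 5, alpha = 0.05


def chi_square_stat(rolls):
    """Goodness-of-fit statistic against a fair six-sided die."""
    counts = Counter(rolls)
    expected = len(rolls) / 6
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, 7))


def rejection_rate(weights, n_rolls, n_trials, rng):
    """Fraction of simulated experiments in which the test rejects fairness."""
    faces = [1, 2, 3, 4, 5, 6]
    rejections = 0
    for _ in range(n_trials):
        rolls = rng.choices(faces, weights=weights, k=n_rolls)
        if chi_square_stat(rolls) > CRITICAL_5DF_05:
            rejections += 1
    return rejections / n_trials


rng = random.Random(42)      # fixed seed so runs are repeatable
fair = [1, 1, 1, 1, 1, 1]
loaded = [1, 1, 1, 1, 1, 2]  # hypothetical bias: six is twice as likely

for n in (60, 300, 1200):
    print(f"n={n:5d}  fair: {rejection_rate(fair, n, 1000, rng):.3f}  "
          f"loaded: {rejection_rate(loaded, n, 1000, rng):.3f}")
```

For the fair die, the rejection rate should stay near the chosen significance level at every sample size. For the loaded die, it should climb toward 1 as the number of rolls grows, which is the sensitivity to sample size the second step asks about.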
The Core Question You’re Answering
“How do you decide if deviations are just randomness or real bias?”
This is the core of hypothesis testing.
Concepts You Must Understand First
Stop and research these before coding:
- Null hypothesis
  - What does “fair die” mean statistically?
  - Book Reference: “OpenIntro Statistics” Ch. 6
- Chi-square test
  - How do you compute chi-square for categorical data?
  - Book Reference: “OpenIntro Statistics” Ch. 6
- p-values
  - What does a p-value actually represent?
  - Book Reference: “OpenIntro Statistics” Ch. 6
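As a reference point while reading, the goodness-of-fit statistic for a six-sided die can be written as follows. The symbols are notation introduced here, not taken from the book: $n$ is the total number of rolls, $O_i$ is the observed count for face $i$, and $E_i$ is the count expected under the null hypothesis that every face has probability $1/6$.

```latex
\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = \frac{n}{6}
```

When the die is fair and $n$ is reasonably large, this statistic approximately follows a chi-square distribution with $6 - 1 = 5$ degrees of freedom. The p-value is the probability, under that distribution, of a value at least as large as the one observed.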
Questions to Guide Your Design
- Sample size
  - How many rolls are needed to detect small biases?
  - How will you handle small counts per face?
- Decision rule
  - What significance level will you choose?
  - How will you report uncertainty?
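One possible way to handle the small-counts question is to check the expected count per face before trusting the chi-square approximation. The sketch below assumes the common rule of thumb of at least 5 expected observations per category. The threshold and the function names are illustrative choices, not requirements of the project.

```python
def expected_counts(n_rolls, n_faces=6):
    """Expected count per face if every face is equally likely."""
    return [n_rolls / n_faces] * n_faces


def approximation_ok(n_rolls, min_expected=5.0, n_faces=6):
    """True when every expected count meets the rule-of-thumb minimum."""
    return all(e >= min_expected for e in expected_counts(n_rolls, n_faces))


for n in (12, 30, 600):
    status = "ok" if approximation_ok(n) else "too few rolls for the chi-square approximation"
    print(f"{n} rolls: expected {n / 6:.1f} per face, {status}")
```

Under this rule a fair six-sided die needs at least 30 rolls. Below that, an exact or simulation-based test is a reasonable alternative to consider.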
Thinking Exercise
Small Sample
If you roll a die 12 times and get 5 sixes, is that evidence of bias?
Questions while working:
- How likely is that under fairness?
- Why do small samples mislead?
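The first question can be answered exactly with the binomial distribution: under fairness, the number of sixes in 12 rolls is Binomial(12, 1/6). The short sketch below computes the chance of seeing 5 or more sixes. The helper name `binomial_tail` is made up for this example.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_value = binomial_tail(12, 5, 1 / 6)
print(f"P(at least 5 sixes in 12 fair rolls) = {p_value:.3f}")
```

The result is roughly 0.036, which looks significant at the 0.05 level. Note, though, that this is a one-sided test on a single face chosen after seeing the data. With six faces to choose from, some face will often look unusual in a sample this small, which is one reason small samples mislead.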
The Interview Questions They’ll Ask
Prepare to answer these:
- “What is the null hypothesis?”
- “What is a chi-square test used for?”
- “What does a p-value mean?”
- “How does sample size affect conclusions?”
- “What is a Type I error?”
Hints in Layers
Hint 1 (Starting Point): Start by counting outcomes for each face.
Hint 2 (Next Level): Compute expected counts under fairness.
Hint 3 (Technical Details): Calculate chi-square and compare it to a critical value for 5 degrees of freedom (11.07 at the 0.05 level).
Hint 4 (Tools/Debugging): Simulate fair dice to verify false positive rates.
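Hints 1 to 3 can be sketched on a small made-up data set. The rolls below are invented for illustration (60 rolls with a few extra sixes) and the variable names are assumptions. The critical value 11.07 corresponds to 5 degrees of freedom at the 0.05 level.

```python
from collections import Counter

CRITICAL_VALUE = 11.07  # chi-square, df = 5, alpha = 0.05

# Made-up data: 60 rolls with face counts 8, 9, 10, 11, 7, 15
rolls = [1] * 8 + [2] * 9 + [3] * 10 + [4] * 11 + [5] * 7 + [6] * 15

observed = Counter(rolls)   # Hint 1: count outcomes per face
expected = len(rolls) / 6   # Hint 2: expected count under fairness

chi_square = 0.0
for face in range(1, 7):    # Hint 3: sum the per-face contributions
    o = observed.get(face, 0)
    contribution = (o - expected) ** 2 / expected
    chi_square += contribution
    print(f"face {face}: observed={o:2d} expected={expected:.1f} "
          f"contribution={contribution:.2f}")

reject = chi_square > CRITICAL_VALUE
print(f"chi-square = {chi_square:.2f}, reject fairness: {reject}")
```

Here the statistic comes to 4.0, well below the critical value, so these 60 rolls do not provide evidence of bias even though six came up 15 times.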
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Hypothesis testing | “OpenIntro Statistics” | Ch. 6 |
| Chi-square | “OpenIntro Statistics” | Ch. 6 |
| p-values | “OpenIntro Statistics” | Ch. 6 |
Implementation Hints
- Use clear labels for observed and expected counts.
- Report both chi-square and p-value.
- Provide an explanation of the decision rule.
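A report that follows these hints might look like the sketch below. It relies only on the standard library, so the p-value comes from a closed-form expression for the chi-square upper tail that is valid only for 5 degrees of freedom. The function names and output wording are assumptions for illustration.

```python
import math


def chi2_sf_df5(x):
    """Upper-tail probability of a chi-square variable with 5 degrees of freedom.

    Closed form that holds only for df = 5.
    """
    if x <= 0:
        return 1.0
    z = math.sqrt(x)
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(z / math.sqrt(2)) + 2 * normal_pdf * (z + z**3 / 3)


def report(observed, alpha=0.05):
    """Return report lines for six observed face counts (faces 1 to 6)."""
    if len(observed) != 6:
        raise ValueError("expected six counts, one per face")
    n = sum(observed)
    expected = n / 6
    stat = sum((o - expected) ** 2 / expected for o in observed)
    p_value = chi2_sf_df5(stat)
    verdict = "evidence of bias" if p_value < alpha else "insufficient evidence of bias"
    return [
        f"Observed counts: {observed}",
        f"Expected count per face: {expected:.1f}",
        f"Chi-square: {stat:.2f} (df = 5)",
        f"p-value: {p_value:.3f}",
        f"Decision rule: reject fairness if p-value < {alpha}",
        f"Conclusion: {verdict}",
    ]


print("\n".join(report([8, 9, 10, 11, 7, 15])))  # made-up counts
```

If SciPy is available, `scipy.stats.chisquare` computes the same statistic and p-value and would be the more general choice, since it is not tied to a six-sided die.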
Learning Milestones
- First milestone: You can compute chi-square statistics.
- Second milestone: You can interpret p-values correctly.
- Final milestone: You can explain the decision in plain language.