Project 13: Experimental Design and Causality Lab
Build a causality lab that compares randomized and quasi-experimental estimates.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced |
| Time Estimate | 2 weeks |
| Main Programming Language | Python + SQL |
| Alternative Programming Languages | R |
| Coolness Level | Level 5: Pure Magic |
| Business Potential | Level 3: Service & Support |
| Prerequisites | Projects 9-12 |
| Key Topics | RCT, sampling, confounding, A/B testing, DiD |
1. Learning Objectives
- Design randomized experiments with integrity checks.
- Quantify and explain confounding in observational estimates.
- Implement introductory difference-in-differences.
- Produce causal claims with assumptions and caveats.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Causal Identification Through Design
- Fundamentals: Design quality drives causal credibility.
- Deep Dive into the concept: Randomization balances confounders in expectation; sampling determines generalizability.
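The balancing claim is easy to see in simulation. The sketch below (a minimal illustration, with a hypothetical "baseline spend" confounder) shows that random assignment leaves the confounder's group means nearly equal even though assignment never looks at it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: a confounder (e.g. baseline spend) that also drives the outcome.
n = 10_000
confounder = rng.normal(loc=50.0, scale=10.0, size=n)

# Random assignment ignores the confounder entirely.
treated = rng.random(n) < 0.5

# In expectation, the arms' confounder means match; any gap is sampling noise.
gap = confounder[treated].mean() - confounder[~treated].mean()
print(f"confounder gap between arms: {gap:.3f}")  # small relative to the SD of 10
```

In finite samples the gap is not exactly zero, which is why balance diagnostics still belong in the lab even for a correctly randomized experiment.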
2.2 Quasi-Experimental Rescue Paths
- Fundamentals: When randomization is unavailable, structured assumptions can still identify effects.
- Deep Dive into the concept: DiD compares changes, not levels, and depends on pre-trend plausibility.
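The "changes, not levels" idea reduces to the canonical 2x2 estimator. A minimal sketch with toy numbers (a shared +2 trend plus a planted +1.5 treatment effect):

```python
import numpy as np

def did_estimate(y_treat_pre, y_treat_post, y_ctrl_pre, y_ctrl_post):
    """Canonical 2x2 difference-in-differences:
    (change in treated arm) minus (change in control arm)."""
    return (np.mean(y_treat_post) - np.mean(y_treat_pre)) - (
        np.mean(y_ctrl_post) - np.mean(y_ctrl_pre)
    )

# Toy data: both groups drift up by 2 (shared trend); treatment adds 1.5 on top.
effect = did_estimate(
    y_treat_pre=[10, 11, 9],
    y_treat_post=[13.5, 14.5, 12.5],   # +2 trend, +1.5 effect
    y_ctrl_pre=[8, 9, 7],
    y_ctrl_post=[10, 11, 9],           # +2 trend only
)
print(effect)  # 1.5
```

Note that the groups start at different levels (means of 10 vs 8); DiD is unbothered by that, but it silently attributes any *divergence in trends* to treatment, which is exactly what the pre-trend check must probe.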
3. Project Specification
3.1 What You Will Build
A lab that runs RCT analysis, confounded observational comparison, and DiD correction in one pipeline.
3.2 Functional Requirements
- Assignment integrity diagnostics (balance, sample ratio checks).
- Standard A/B treatment effect analysis.
- Naive observational estimator with a confounding warning.
- DiD estimator with pre-trend checks.
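For the sample-ratio check in the first requirement, a standard approach is a chi-square goodness-of-fit test of observed arm counts against the planned split. A minimal sketch (the function name and alpha threshold are illustrative choices, not a fixed API):

```python
from scipy.stats import chisquare

def srm_check(n_treat, n_ctrl, expected_ratio=0.5, alpha=0.001):
    """Sample ratio mismatch test: compare observed arm counts to the planned split.
    A very small p-value flags a likely assignment bug rather than chance imbalance."""
    total = n_treat + n_ctrl
    expected = [total * expected_ratio, total * (1 - expected_ratio)]
    stat, p = chisquare([n_treat, n_ctrl], f_exp=expected)
    return p, p < alpha

p_ok, flagged_ok = srm_check(5000, 5100)    # plausible noise under a 50/50 split
p_bad, flagged_bad = srm_check(5000, 5600)  # far outside noise; investigate assignment
```

A strict alpha (e.g. 0.001) is conventional here because an SRM flag should mean "stop and debug the assignment mechanism", not "one more borderline test".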
3.3 Non-Functional Requirements
- Full assumptions log in outputs.
- Decision-ready causal memo.
3.4 Example Usage / Output
$ python causality_lab.py --case pricing_rollout
RCT effect: +1.8pp
Naive observational effect: +3.4pp
DiD effect: +1.6pp
Conclusion: naive estimate confounded upward
3.5 Real World Outcome
You can defend when a claim is causal, correlational, or uncertain, with explicit design rationale.
4. Solution Architecture
Assignment diagnostics -> RCT estimator -> observational baseline -> DiD module -> causal brief
5. Implementation Guide
5.1 Development Environment Setup
pip install numpy pandas scipy statsmodels
5.2 Project Structure
P13/
causality_lab.py
cases/
outputs/
5.3 The Core Question You Are Answering
“Is observed uplift caused by intervention or by confounding structure?”
5.4 Concepts You Must Understand First
- Potential outcomes intuition
- Randomization and balance
- Confounding and selection bias
- DiD and parallel trends
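The confounding and selection-bias concepts above can be made concrete with a small potential-outcomes simulation. In this hypothetical setup, a single confounder ("health") drives both self-selection into treatment and the outcome, so the naive treated-vs-untreated comparison overstates a known true effect of 1.0:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Confounder drives both self-selection into treatment and the outcome.
health = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-2 * health))        # healthier users adopt more often
treated = rng.random(n) < p_treat

true_effect = 1.0
outcome = 0.5 + true_effect * treated + 2.0 * health + rng.normal(size=n)

# Naive comparison mixes the treatment effect with the health gap between groups.
naive = outcome[treated].mean() - outcome[~treated].mean()
print(f"true effect: {true_effect}, naive estimate: {naive:.2f}")  # biased upward
```

This is exactly the pattern the lab's "confounding warning" should surface: the bias equals the outcome model's confounder coefficient times the between-arm confounder gap.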
5.5 Questions to Guide Your Design
- What is the right randomization unit?
- Which guardrails detect harmful side effects?
- How will you test pre-trend assumptions?
5.6 Thinking Exercise
For one historical launch, draft both a causal and non-causal interpretation. List missing evidence.
5.7 The Interview Questions They’ll Ask
- Why can randomized groups still look imbalanced in finite samples?
- What is sample ratio mismatch?
- What does parallel trends require?
- Why is observational uplift often biased?
- How do you communicate causal uncertainty?
5.8 Hints in Layers
- Hint 1: Start with assignment diagnostics.
- Hint 2: Add RCT estimator pipeline.
- Hint 3: Add naive observational comparator.
- Hint 4: Add DiD and pre-trend visualization.
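For Hint 4, one simple pre-trend diagnostic is to fit a linear trend to each arm's pre-period series and compare slopes; a large slope gap undermines parallel trends. A minimal sketch (the helper name is illustrative; a fuller version would also report uncertainty on the gap):

```python
import numpy as np

def pretrend_slope_gap(treat_pre, ctrl_pre):
    """Fit a linear trend to each arm's pre-period series and return the slope gap.
    A gap near zero is consistent with parallel trends; a large gap is a red flag."""
    slope_treat = np.polyfit(np.arange(len(treat_pre)), treat_pre, 1)[0]
    slope_ctrl = np.polyfit(np.arange(len(ctrl_pre)), ctrl_pre, 1)[0]
    return slope_treat - slope_ctrl

# Different levels but identical pre-period slopes: gap of (roughly) zero.
gap = pretrend_slope_gap([10, 10.5, 11, 11.5], [8, 8.5, 9, 9.5])
```

Plotting both pre-period series on one axis, as the hint suggests, communicates the same check visually and is usually more persuasive in the causal memo than the number alone.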
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Causal inference intro | Mostly Harmless Econometrics | Ch. 2-5 |
| Experimentation at scale | Trustworthy Online Controlled Experiments | selected |
| Program evaluation | causal texts | selected |
6. Testing Strategy
- Synthetic data with known treatment effects.
- Confounding injection tests.
- Pre-trend violation tests for DiD warnings.
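The first testing item, synthetic data with a known effect, can be sketched as a self-check: generate a clean RCT with a planted effect and assert the estimator recovers it within sampling noise (function names here are illustrative, not from the project code):

```python
import numpy as np

def simulate_rct(n, effect, seed=0):
    """Generate a clean RCT: 50/50 random assignment, additive treatment effect."""
    rng = np.random.default_rng(seed)
    treated = rng.random(n) < 0.5
    outcome = 1.0 + effect * treated + rng.normal(size=n)
    return treated, outcome

def ate(treated, outcome):
    """Difference-in-means estimate of the average treatment effect."""
    return outcome[treated].mean() - outcome[~treated].mean()

treated, outcome = simulate_rct(n=50_000, effect=0.3, seed=42)
estimate = ate(treated, outcome)
assert abs(estimate - 0.3) < 0.05  # recovers the planted effect within noise
```

The confounding-injection test is the same idea in reverse: plant a confounder, assert the naive estimator is biased by roughly the predicted amount, and assert the warning fires.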
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Assignment bugs | misleading uplift | sample-ratio diagnostics |
| Post-hoc slicing | unstable claims | pre-registered analysis plan |
| Ignored trend shifts | invalid DiD | strict pre-period checks |
8. Extensions & Challenges
- Add synthetic control variant.
- Add heterogeneity analysis with multiplicity controls.
9. Real-World Connections
- Product experimentation.
- Policy and pricing impact evaluation.
10. Resources
- Mostly Harmless Econometrics
- Trustworthy Online Controlled Experiments references
11. Self-Assessment Checklist
- I can explain why design quality is causal quality.
- I can detect and explain confounding.
- I can apply and caveat DiD correctly.
12. Submission / Completion Criteria
Minimum: RCT + observational comparison + assumptions note.
Full: includes DiD with pre-trend diagnostics and final causal memo.