Project 13: Experimental Design and Causality Lab

Build a causality lab that compares randomized and quasi-experimental estimates.

Quick Reference

Attribute                          Value
---------------------------------  --------------------------------------------
Difficulty                         Level 3: Advanced
Time Estimate                      2 weeks
Main Programming Language          Python + SQL
Alternative Programming Languages  R
Coolness Level                     Level 5: Pure Magic
Business Potential                 3. Service & Support
Prerequisites                      Projects 9-12
Key Topics                         RCT, sampling, confounding, A/B testing, DiD

1. Learning Objectives

  1. Design randomized experiments with integrity checks.
  2. Quantify and explain confounding in observational estimates.
  3. Implement introductory difference-in-differences.
  4. Produce causal claims with assumptions and caveats.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Causal Identification Through Design

  • Fundamentals: Design quality drives causal credibility.
  • Deep Dive into the concept: Randomization balances both observed and unobserved confounders in expectation; the sampling frame determines how far the results generalize.
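A minimal simulation makes the "in expectation" point concrete: the assignment mechanism never looks at the confounder, so the confounder's distribution ends up (approximately) equal across arms. The confounder here is an arbitrary illustrative variable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# A confounder (e.g. prior spend) that would bias any self-selected comparison.
confounder = rng.lognormal(mean=3.0, sigma=0.5, size=n)

# Random assignment ignores the confounder entirely...
treated = rng.random(n) < 0.5

# ...so its mean is approximately equal across arms (exactly equal in expectation).
gap = confounder[treated].mean() - confounder[~treated].mean()
print(f"confounder gap between arms: {gap:.4f}")  # close to 0
```

In finite samples the gap is small but nonzero, which is exactly why the balance diagnostics in Section 3.2 exist.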

2.2 Quasi-Experimental Rescue Paths

  • Fundamentals: When randomization is unavailable, structured assumptions can still identify effects.
  • Deep Dive into the concept: DiD compares changes over time rather than post-period levels, so a fixed level gap between groups drops out; its validity hinges on the parallel-trends assumption, usually assessed against pre-period trends.
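The "changes, not levels" point can be shown with a 2x2 arithmetic sketch. All numbers below are hypothetical conversion rates in percentage points, chosen only for illustration.

```python
# Hypothetical conversion rates (%) before/after, for treated and control groups.
treated_pre, treated_post = 10.0, 13.4
control_pre, control_post = 12.0, 13.8

# Comparing post-period LEVELS mixes in the pre-existing gap between groups.
naive_level_gap = treated_post - control_post

# DiD compares CHANGES: the shared trend and the baseline gap net out,
# leaving the effect -- provided the groups would have trended in parallel.
did = (treated_post - treated_pre) - (control_post - control_pre)
print(round(naive_level_gap, 1), round(did, 2))  # -0.4 1.6
```

Note the level comparison even gets the sign wrong here, because the control group started from a higher baseline.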

3. Project Specification

3.1 What You Will Build

A lab that runs RCT analysis, confounded observational comparison, and DiD correction in one pipeline.

3.2 Functional Requirements

  1. Assignment integrity diagnostics (balance, sample ratio checks).
  2. Standard A/B treatment effect analysis.
  3. Naive observational estimator with an explicit confounding warning.
  4. DiD estimator with pre-trend checks.

3.3 Non-Functional Requirements

  • Full assumptions log in outputs.
  • Decision-ready causal memo.

3.4 Example Usage / Output

$ python causality_lab.py --case pricing_rollout
RCT effect: +1.8pp
Naive observational effect: +3.4pp
DiD effect: +1.6pp
Conclusion: naive estimate confounded upward

3.5 Real World Outcome

You can defend when a claim is causal, correlational, or uncertain, with explicit design rationale.


4. Solution Architecture

Assignment diagnostics -> RCT estimator -> observational baseline -> DiD module -> causal brief

5. Implementation Guide

5.1 Development Environment Setup

pip install numpy pandas scipy statsmodels

5.2 Project Structure

P13/
  causality_lab.py
  cases/
  outputs/

5.3 The Core Question You Are Answering

“Is observed uplift caused by intervention or by confounding structure?”

5.4 Concepts You Must Understand First

  1. Potential outcomes intuition
  2. Randomization and balance
  3. Confounding and selection bias
  4. DiD and parallel trends
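Concept 3 (confounding and selection bias) can be simulated directly: a variable that raises both adoption and the outcome inflates the naive difference in means. The variable names and coefficients are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Confounder: e.g. engagement, which drives both adoption and the outcome.
engagement = rng.normal(size=n)

# Self-selection: more engaged users adopt the feature more often.
adopt_prob = 1 / (1 + np.exp(-engagement))
treated = rng.random(n) < adopt_prob

# True treatment effect is +2.0; engagement adds its own lift on top.
outcome = 2.0 * treated + 3.0 * engagement + rng.normal(size=n)

naive = outcome[treated].mean() - outcome[~treated].mean()
print(f"naive estimate: {naive:.2f} (true effect: 2.0)")  # biased upward
```

Injecting a known effect like this is also the basis of the synthetic tests in Section 6: the pipeline should recover 2.0 from a randomized version of the same data, and flag the observational version.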

5.5 Questions to Guide Your Design

  1. What is the right randomization unit?
  2. Which guardrails detect harmful side effects?
  3. How will you test pre-trend assumptions?

5.6 Thinking Exercise

For one historical launch, draft both a causal and non-causal interpretation. List missing evidence.

5.7 The Interview Questions They’ll Ask

  1. Why can randomized groups still look imbalanced in finite samples?
  2. What is sample ratio mismatch?
  3. What does parallel trends require?
  4. Why is observational uplift often biased?
  5. How do you communicate causal uncertainty?

5.8 Hints in Layers

  • Hint 1: Start with assignment diagnostics.
  • Hint 2: Add RCT estimator pipeline.
  • Hint 3: Add naive observational comparator.
  • Hint 4: Add DiD and pre-trend visualization.
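Hint 4's DiD estimator can be sketched as an interaction regression on simulated two-period data; with numeric indicators, the coefficient on treated x post is the DiD estimate. All names and numbers here are illustrative, not a prescribed implementation.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 5_000

# Simulated two-period data: the treated group has a higher baseline level,
# both groups share a common time trend, and the true effect is 1.6.
df = pd.DataFrame({
    "treated": np.repeat([0, 1], n),
    "post": np.tile([0, 1], n),
})
df["y"] = (
    5.0
    + 2.0 * df["treated"]                # level gap (confounds a post-only comparison)
    + 1.0 * df["post"]                   # common time trend
    + 1.6 * df["treated"] * df["post"]   # true treatment effect
    + rng.normal(scale=1.0, size=len(df))
)

# DiD is the coefficient on the treated x post interaction.
model = smf.ols("y ~ treated * post", data=df).fit()
print(f"DiD estimate: {model.params['treated:post']:.2f}")
```

The pre-trend visualization then amounts to plotting group means over several pre-periods and checking the gap stays roughly constant before treatment.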

5.9 Books That Will Help

Topic                     Book                                        Chapter
------------------------  ------------------------------------------  -----------------
Causal inference intro    Mostly Harmless Econometrics                Ch. 2-5
Experimentation at scale  Trustworthy Online Controlled Experiments   Selected chapters
Program evaluation        Causal texts                                Selected chapters

6. Testing Strategy

  • Synthetic data with known treatment effects.
  • Confounding injection tests.
  • Pre-trend violation tests for DiD warnings.
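The first bullet can be sketched as a self-checking synthetic test: generate binary outcomes with a known uplift and assert the estimator recovers it. The function name, rates, and tolerances are hypothetical.

```python
import numpy as np
from scipy import stats

def ab_effect(y_treated, y_control):
    """Difference in means with a Welch t-test p-value."""
    diff = y_treated.mean() - y_control.mean()
    _, p = stats.ttest_ind(y_treated, y_control, equal_var=False)
    return diff, p

def test_recovers_known_effect():
    rng = np.random.default_rng(3)
    true_effect = 0.018  # +1.8pp, matching the example output in Section 3.4
    control = rng.binomial(1, 0.10, size=200_000)
    treated = rng.binomial(1, 0.10 + true_effect, size=200_000)
    diff, p = ab_effect(treated, control)
    assert abs(diff - true_effect) < 0.005
    assert p < 0.01

test_recovers_known_effect()
print("ok")
```

Confounding injection tests work the same way: add a selection mechanism as in Section 5.4 and assert the pipeline's warning fires.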

7. Common Pitfalls & Debugging

Pitfall               Symptom            Solution
--------------------  -----------------  ----------------------------
Assignment bugs       Misleading uplift  Sample-ratio diagnostics
Post-hoc slicing      Unstable claims    Pre-registered analysis plan
Ignored trend shifts  Invalid DiD        Strict pre-period checks

8. Extensions & Challenges

  • Add synthetic control variant.
  • Add heterogeneity analysis with multiplicity controls.

9. Real-World Connections

  • Product experimentation.
  • Policy and pricing impact evaluation.

10. Resources

  • Mostly Harmless Econometrics
  • Trustworthy OCE references

11. Self-Assessment Checklist

  • I can explain why design quality is causal quality.
  • I can detect and explain confounding.
  • I can apply and caveat DiD correctly.

12. Submission / Completion Criteria

Minimum: RCT + observational comparison + assumptions note.

Full: includes DiD with pre-trend diagnostics and final causal memo.