A Type I error occurs when a test declares a winner that isn't actually better. You observed a difference in the data, but that difference was due to random noise — not a real effect of your change.
In A/B testing, a Type I error means you ship a "winning" variant that produces no improvement (or actively hurts performance) in production.
The Significance Level Controls Type I Error Rate
The significance level (α) is the maximum acceptable probability of a Type I error. At α = 0.05, you accept a 5% chance of falsely declaring a winner on any given test.
| Significance level | Type I error rate | Confidence level |
|---|---|---|
| α = 0.10 | 10% | 90% |
| α = 0.05 | 5% | 95% |
| α = 0.01 | 1% | 99% |
The stricter your significance threshold, the larger the sample you need before a real effect can clear it.
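To make the α = 0.05 row concrete, here is a minimal A/A simulation sketch: both arms share the same conversion rate, so every "significant" result is by definition a false positive. The 5% baseline rate, 10,000 visitors per arm, and the pooled two-proportion z-test are illustrative assumptions, not prescribed by this article.

```python
import numpy as np
from scipy import stats

# Illustrative assumptions: 5% conversion in BOTH arms, 10,000 visitors
# per arm, 2,000 simulated A/A tests.
rng = np.random.default_rng(42)
alpha, p_base, n, trials = 0.05, 0.05, 10_000, 2_000

false_positives = 0
for _ in range(trials):
    # Both arms draw from the SAME conversion rate: any "winner" is noise.
    conv_a = rng.binomial(n, p_base)
    conv_b = rng.binomial(n, p_base)
    # Two-sided, pooled two-proportion z-test.
    p_pool = (conv_a + conv_b) / (2 * n)
    se = np.sqrt(2 * p_pool * (1 - p_pool) / n)
    z = (conv_b / n - conv_a / n) / se
    p_value = 2 * stats.norm.sf(abs(z))
    false_positives += p_value < alpha

# With no real effect, roughly alpha (about 5%) of tests still "win".
print(f"False positive rate: {false_positives / trials:.3f}")
```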
What Inflates Type I Errors
Several common practices push the real false positive rate above the nominal α:
- Peeking — Stopping a test early when it looks significant inflates false positives dramatically. A test checked daily at α = 0.05 can have a true false positive rate above 25% (the simulation sketch after this list shows the effect).
- Multiple metrics — Testing 10 metrics simultaneously at α = 0.05 gives roughly a 40% chance of at least one false positive, since 1 − 0.95^10 ≈ 0.40 (use a Bonferroni correction or designate a single primary metric).
- Multiple variants — Each pairwise comparison between variants adds another chance of a false positive (apply a multiplicity correction so the family-wise error rate across all comparisons stays at α).
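Here is a rough sketch of why peeking inflates the error rate, again as an A/A simulation. The 14-day duration, 1,000 visitors per arm per day, and one check per day are illustrative assumptions; the exact inflation depends on how often you look.

```python
import numpy as np
from scipy import stats

# Illustrative assumptions: 14-day test, 1,000 visitors per arm per day,
# 5% conversion in BOTH arms, 2,000 simulated A/A tests.
rng = np.random.default_rng(7)
alpha, p_base, daily_n, days, trials = 0.05, 0.05, 1_000, 14, 2_000

def p_value(conv_a, conv_b, n):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (2 * n)
    se = np.sqrt(2 * p_pool * (1 - p_pool) / n)
    z = (conv_b / n - conv_a / n) / se
    return 2 * stats.norm.sf(abs(z))

peeking_fp = fixed_fp = 0
for _ in range(trials):
    daily_a = rng.binomial(daily_n, p_base, size=days)
    daily_b = rng.binomial(daily_n, p_base, size=days)
    cum_a, cum_b = np.cumsum(daily_a), np.cumsum(daily_b)
    pvals = [p_value(cum_a[d], cum_b[d], (d + 1) * daily_n) for d in range(days)]
    peeking_fp += any(p < alpha for p in pvals)  # stop at the first "significant" peek
    fixed_fp += pvals[-1] < alpha                # look only at the planned end date

print(f"Fixed-horizon false positive rate: {fixed_fp / trials:.3f}")    # ~ alpha
print(f"Daily-peeking false positive rate: {peeking_fp / trials:.3f}")  # well above alpha
```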
Reducing Type I Errors
- Pre-register your primary metric before launching
- Run the test for its planned duration without early stopping
- Use a multiplicity correction when testing several metrics (a sketch of a Bonferroni adjustment follows this list)
- Use a stricter significance level (e.g., α = 0.01) for high-stakes decisions
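As an illustration of the multiplicity point, the sketch below applies a Bonferroni correction with statsmodels to ten made-up metric p-values; the numbers are hypothetical, not from any real test.

```python
from statsmodels.stats.multitest import multipletests

alpha = 0.05
# Hypothetical p-values for ten metrics (made-up numbers for illustration).
p_values = [0.003, 0.020, 0.048, 0.140, 0.260, 0.310, 0.450, 0.520, 0.710, 0.940]

# With 10 uncorrected tests at alpha = 0.05, the chance of at least one
# false positive under the null is 1 - 0.95**10, roughly 40%.
print(f"Family-wise false positive risk, uncorrected: {1 - (1 - alpha) ** 10:.2f}")

# Bonferroni effectively compares each p-value against alpha / m (here 0.005),
# keeping the family-wise error rate at or below alpha.
reject, p_adjusted, _, _ = multipletests(p_values, alpha=alpha, method="bonferroni")
for p, p_adj, r in zip(p_values, p_adjusted, reject):
    print(f"p = {p:.3f}  adjusted = {p_adj:.3f}  significant: {bool(r)}")
```

Note how the nominally significant 0.020 and 0.048 no longer pass once the correction accounts for all ten comparisons.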
Type I errors are the most common mistake in CRO programs. The cure is discipline: commit to a runtime, commit to a primary metric, and don't peek.