A Type II error happens when a test concludes "no significant difference" even though a real improvement existed. You missed it. In CRO, this means discarding a variant that would have genuinely increased conversions if you'd given the test enough data.
The probability of a Type II error is denoted β. Statistical power is its complement: power = 1 − β.
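One way to make β concrete is simulation: run many synthetic A/B tests in which a real lift exists, apply a standard two-proportion z-test to each, and count how often the test fails to reach significance. That miss rate estimates β, and its complement estimates power. Below is a minimal sketch; the conversion rates, sample size, and trial count are illustrative assumptions, not values from this article.

```python
# Monte Carlo estimate of beta (Type II error rate) for a two-proportion
# z-test. All parameter values below are illustrative assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
p_control, p_variant = 0.05, 0.06  # a real 1-point lift exists
n = 5_000                          # assumed visitors per arm
alpha = 0.05
trials = 2_000
misses = 0

for _ in range(trials):
    conv_c = rng.binomial(n, p_control)
    conv_v = rng.binomial(n, p_variant)
    p_pool = (conv_c + conv_v) / (2 * n)          # pooled conversion rate
    se = np.sqrt(2 * p_pool * (1 - p_pool) / n)   # standard error of the difference
    z = (conv_v - conv_c) / n / se
    p_value = 2 * norm.sf(abs(z))
    if p_value >= alpha:   # test says "no significant difference"
        misses += 1        # but the lift was real: a Type II error

beta_hat = misses / trials
print(f"estimated beta = {beta_hat:.2f}, estimated power = {1 - beta_hat:.2f}")
```

Under these assumptions the test misses the real lift in roughly four runs out of ten, despite 5,000 visitors per arm.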
Type I vs. Type II Errors
| | Test declares: no difference | Test declares: significant difference |
|---|---|---|
| No real effect | Correct (true negative) | Type I error (false positive) |
| Real effect exists | Type II error (false negative) | Correct (true positive) |
Both errors have costs. Type I errors waste engineering effort shipping changes that don't help. Type II errors cause you to miss real wins.
Common Causes
- Underpowered tests — The sample is too small to detect the effect size you care about. This is the most frequent cause.
- Too short a runtime — Stopping the test before reaching the required sample size.
- High baseline variability — Metrics with high variance (like revenue) require larger samples than stable metrics (like click rate).
- MDE set too high — If you power the test to detect a 30% lift but the true effect is only 12%, you won't see it; the sketch after this list quantifies the gap.
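The last cause is worth quantifying. Here is a hypothetical sketch using statsmodels (the baseline rate and α are assumptions): size the test for a 30% relative lift at 80% power, then compute the power that same sample actually has against a true 12% lift.

```python
# How much power does a test sized for a 30% lift have against a true
# 12% lift? Baseline rate and alpha are illustrative assumptions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.05   # assumed control conversion rate
alpha = 0.05
solver = NormalIndPower()

# Sample size planned around the optimistic 30% relative lift.
planned_es = proportion_effectsize(baseline * 1.30, baseline)
n_planned = solver.solve_power(effect_size=planned_es, alpha=alpha,
                               power=0.80, alternative="two-sided")

# Power of that same sample against the true 12% relative lift.
true_es = proportion_effectsize(baseline * 1.12, baseline)
actual_power = solver.solve_power(effect_size=true_es, nobs1=n_planned,
                                  alpha=alpha, alternative="two-sided")

print(f"planned sample per arm: {n_planned:,.0f}")
print(f"power against the true 12% lift: {actual_power:.2f}")
```

Under these assumptions the power against the true effect falls far below the 0.80 target, so the test is more likely to miss the 12% lift than to detect it.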
How to Avoid Type II Errors
- Use a sample size calculator before launching — Input your baseline rate, desired MDE, α, and target power (typically 0.80); see the sketch after this list
- Never stop a test early — Let it run to the planned sample size
- Choose a realistic MDE — Base it on historical test results and business thresholds
- Segment high-variance metrics — For revenue-based metrics, consider running on a sub-segment with more stable behavior
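As a minimal pre-launch sketch for the first two bullets, the calculation can be a few lines of statsmodels. The baseline rate, MDE, and traffic figure below are illustrative assumptions, not recommendations.

```python
# Pre-launch sample size and runtime estimate for a two-arm test.
# All input values are illustrative assumptions.
import math

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.04          # assumed baseline conversion rate
mde_relative = 0.10      # assumed MDE: a +10% relative lift
alpha = 0.05
target_power = 0.80
daily_visitors = 2_000   # assumed eligible traffic per day, split 50/50

effect_size = proportion_effectsize(baseline * (1 + mde_relative), baseline)
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=alpha,
    power=target_power,
    alternative="two-sided",
)
days_needed = math.ceil(2 * n_per_arm / daily_visitors)

print(f"required sample per arm: {n_per_arm:,.0f}")
print(f"planned runtime at current traffic: {days_needed} days")
```

With these assumed inputs the required runtime comes out to several weeks, which is exactly why stopping early quietly converts a well-planned test into an underpowered one.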
The Practical Cost
A CRO program running underpowered tests accumulates Type II errors silently. You see a long list of "no significant results" and conclude the site is already optimized, when in fact your tests could never have detected the real improvements you were making.