Type II Error | Surface AI Hub

A Type II error (a false negative) happens when a test concludes "no significant difference" even though a real effect exists. You missed it. In CRO, this means discarding a variant that would have genuinely increased conversions, usually because the test never collected enough data to detect the lift.

The probability of a Type II error is denoted β. Statistical power is its complement: power = 1 − β. Both depend on the true effect size, the sample size per variant, and the significance level α.
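To make the relationship concrete, here is a minimal sketch that approximates power, and therefore β, for a two-sided test comparing two conversion rates. It assumes a normal approximation with unpooled variances. The function name, the rates, and the sample size are illustrative choices, not figures from a real test.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_variant, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Simplifying assumption: normal approximation, unpooled variances.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standard error of the difference in conversion rates
    se = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_variant * (1 - p_variant) / n_per_arm
    )
    # How many standard errors the true difference sits away from zero
    shift = abs(p_variant - p_control) / se
    # Probability that the test statistic lands in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical test: 5.0% baseline, 5.5% true variant rate, 10,000 visitors per arm
power = power_two_proportions(0.05, 0.055, n_per_arm=10_000)
beta = 1 - power
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

Under these assumptions, power comes out well below the usual 0.80 target, so a test like this would miss the real lift more often than it finds it.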

Type I vs. Type II Errors

	Test declares: no difference	Test declares: significant difference
No real effect	Correct (true negative)	Type I error (false positive)
Real effect exists	Type II error (false negative)	Correct (true positive)

Both errors have costs. Type I errors waste engineering effort shipping changes that don't help. Type II errors cause you to miss real wins.
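The two rows of the table can be illustrated with a small simulation. The sketch below runs many simulated A/B tests, first with no real effect and then with a real one, and counts how often a pooled two-proportion z-test declares significance. The conversion rates, sample size, number of runs, and seed are all hypothetical values chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided test at alpha = 0.05


def is_significant(conv_a, conv_b, n):
    """Pooled two-proportion z-test with n visitors in each arm."""
    p_a, p_b = conv_a / n, conv_b / n
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n)
    return se > 0 and abs(p_b - p_a) / se > Z_CRIT


def significant_share(p_a, p_b, n, runs, rng):
    """Fraction of simulated tests that declare a significant difference."""
    hits = 0
    for _ in range(runs):
        conv_a = sum(rng.random() < p_a for _ in range(n))
        conv_b = sum(rng.random() < p_b for _ in range(n))
        hits += is_significant(conv_a, conv_b, n)
    return hits / runs


rng = random.Random(42)
# Row 1: no real effect, so every "significant" result is a Type I error
type_1_rate = significant_share(0.10, 0.10, n=1000, runs=500, rng=rng)
# Row 2: real effect exists, so every "no difference" result is a Type II error
type_2_rate = 1 - significant_share(0.10, 0.12, n=1000, runs=500, rng=rng)
print(f"Type I rate: {type_1_rate:.2f}, Type II rate: {type_2_rate:.2f}")
```

With a sample this small relative to the effect, the Type I rate should stay near α while the Type II rate is much larger, which is the asymmetry underpowered programs tend to overlook.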

Common Causes

Underpowered tests — The sample is too small to detect the effect size you care about. This is the most frequent cause.
Too short a runtime — Stopping the test before reaching the required sample size.
High baseline variability — Metrics with high variance (like revenue) require larger samples than stable metrics (like click rate).
MDE set too high — If you sized the test to detect a 30% lift but the true effect is 12%, the test is underpowered for the real effect and will usually miss it.
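The 30% versus 12% scenario in the list above can be worked through numerically. This sketch sizes a test for a 30% relative lift at 80% power, then asks how much power that same sample has if the true lift is only 12%. The 5% baseline rate is an assumed value, and both formulas use the normal approximation with unpooled variances.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()


def variant_rate(baseline, relative_lift):
    """Conversion rate of the variant given a relative lift over baseline."""
    return baseline * (1 + relative_lift)


def sample_size_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Visitors per arm needed to detect p1 vs. p2 (two-sided z-test)."""
    z_sum = Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)
    return ceil(z_sum**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2)


def power_at(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power for a given true difference and sample size."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_arm)
    shift = abs(p2 - p1) / se
    return (1 - Z.cdf(z_crit - shift)) + Z.cdf(-z_crit - shift)


BASELINE = 0.05  # assumed baseline conversion rate

# Plan the test around a 30% relative lift
n = sample_size_per_arm(BASELINE, variant_rate(BASELINE, 0.30))
power_planned = power_at(BASELINE, variant_rate(BASELINE, 0.30), n)
# Evaluate the same sample when the true lift is only 12%
power_true = power_at(BASELINE, variant_rate(BASELINE, 0.12), n)

print(f"n per arm: {n:,}")
print(f"power if lift is 30%: {power_planned:.2f}")
print(f"power if lift is 12%: {power_true:.2f}")
```

Under these assumptions, the test planned for a 30% lift has only a small chance of detecting a 12% lift, so a "no significant difference" result says very little about whether the variant actually works.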

How to Avoid Type II Errors

Use a sample size calculator before launching — Input your baseline rate, desired MDE, α, and target power (typically 0.80)
Don't stop a test early — Let it run to the planned sample size, even if interim results look flat
Choose a realistic MDE — Base it on historical test results and business thresholds
Segment high-variance metrics — For revenue-based metrics, consider running on a sub-segment with more stable behavior
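As a rough illustration of the first step above, the following sketch is a bare-bones sample size calculator for a conversion-rate test. It takes the same four inputs (baseline rate, relative MDE, α, and target power) and uses the standard normal-approximation formula for two proportions. The baseline rate and the DAILY_VISITORS figure are hypothetical, and a production calculator may use a slightly different formula.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Visitors per arm for a two-sided two-proportion z-test.

    relative_mde is the smallest relative lift worth detecting (0.10 = 10%).
    """
    z = NormalDist()
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = z.inv_cdf(1 - alpha / 2)  # controls the Type I error rate
    z_beta = z.inv_cdf(power)           # controls the Type II error rate
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2)


DAILY_VISITORS = 4_000  # hypothetical total traffic, split evenly across two arms

for mde in (0.30, 0.20, 0.10):
    n = sample_size_per_arm(0.05, mde)
    days = ceil(2 * n / DAILY_VISITORS)
    print(f"MDE {mde:.0%}: {n:,} visitors per arm, about {days} days")
```

Required sample size grows roughly with the inverse square of the MDE, so halving the MDE roughly quadruples the traffic and runtime you need. That trade-off is why the MDE should come from historical results and business thresholds rather than from the runtime you would like to have.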

The Practical Cost

A CRO program running underpowered tests accumulates Type II errors silently. You see a long list of "no significant results" and conclude the site is optimized — when in fact you've been running tests that couldn't detect the real improvements you were making.