Statistical Power

Statistical power is the probability that a test correctly detects a real effect when one exists, typically set at 80% for A/B testing.

Statistical power is the probability that your test will detect a real difference between control and variant, assuming that difference actually exists. It equals 1 − β, where β is the probability of a Type II error (a false negative). If power is 80%, there is a 20% chance you will miss a real effect of the assumed size.

The Power-Sample Size Relationship

Power is chosen during test planning, before data collection. Together with the significance level and the minimum detectable effect, it determines how large your sample needs to be: higher power requires more traffic.

Power level | Type II error rate | Common use case
70%         | 30%                | Exploratory tests, high traffic
80%         | 20%                | Standard CRO practice
90%         | 10%                | High-stakes tests, revenue-critical
95%         | 5%                 | Medical/regulatory contexts

The industry default is 80% power with a 95% confidence level (α = 0.05). At that significance level, raising power from 80% to 90% increases the required sample size by roughly a third (about 34% for a two-sided test).
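To see how the required sample size scales with power, the sketch below evaluates the term (z for α/2 plus z for power), squared, which multiplies the sample size in the usual normal-approximation formula. It assumes a two-sided test at α = 0.05; the function name sample_size_factor is illustrative and not taken from any particular calculator.

```python
from statistics import NormalDist


def sample_size_factor(alpha: float, power: float) -> float:
    """Return (z_{1-alpha/2} + z_{power})^2 for a two-sided test.

    Under the normal approximation, required sample size is
    proportional to this value when baseline and MDE are held fixed.
    """
    z = NormalDist().inv_cdf  # quantile function of the standard normal
    return (z(1 - alpha / 2) + z(power)) ** 2


# Compare each power level against the 80% default at alpha = 0.05.
base = sample_size_factor(0.05, 0.80)
for power in (0.70, 0.80, 0.90, 0.95):
    ratio = sample_size_factor(0.05, power) / base
    print(f"power={power:.0%}  sample size relative to 80% power: {ratio:.2f}x")
```

Because the baseline rate and the effect size cancel out of the ratio, this comparison holds regardless of the metric being tested, as long as the normal approximation is reasonable.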

Why Underpowered Tests Are Dangerous

Running underpowered tests creates two problems:

  1. False negatives — Real improvements go undetected, causing you to discard changes that would have helped.
  2. Inflated winner estimates — When an underpowered test does reach significance, the observed lift tends to be an overestimate of the true effect (the "winner's curse"). You ship expecting a 20% lift and see 8% in production.
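Both problems can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a model of any real test: it assumes a 10% baseline conversion rate, a true relative lift of 10%, only 1,000 visitors per variant (far too few for that effect), and a pooled two-proportion z-test. The function name simulate_winners and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def simulate_winners(p_control, true_lift, n_per_variant, n_tests,
                     alpha=0.05, seed=0):
    """Run many simulated A/B tests; return observed lifts of significant winners."""
    rng = random.Random(seed)
    p_variant = p_control * (1 + true_lift)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    winner_lifts = []
    for _ in range(n_tests):
        # Draw conversions for each arm as independent Bernoulli trials.
        conv_a = sum(rng.random() < p_control for _ in range(n_per_variant))
        conv_b = sum(rng.random() < p_variant for _ in range(n_per_variant))
        rate_a = conv_a / n_per_variant
        rate_b = conv_b / n_per_variant
        # Pooled standard error for the difference in proportions.
        pooled = (conv_a + conv_b) / (2 * n_per_variant)
        se = math.sqrt(2 * pooled * (1 - pooled) / n_per_variant)
        # Keep only tests where the variant wins at the chosen significance level.
        if se > 0 and rate_a > 0 and (rate_b - rate_a) / se > z_crit:
            winner_lifts.append(rate_b / rate_a - 1)
    return winner_lifts


lifts = simulate_winners(p_control=0.10, true_lift=0.10,
                         n_per_variant=1000, n_tests=1000)
print(f"tests where the variant reached significance: {len(lifts)} of 1000")
if lifts:
    print(f"mean observed lift among winners: {mean(lifts):.1%} (true lift: 10.0%)")
```

Under these assumptions, most runs miss the real effect, and the runs that do reach significance can only do so by observing a lift well above the true 10%, which is why the average among winners is inflated.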

How to Calculate Power

Before launching, use a sample size calculator with these inputs:

  • Baseline conversion rate — your current metric value
  • Minimum detectable effect (MDE) — the smallest lift worth detecting
  • Significance level (α) — typically 0.05
  • Desired power (1 − β) — typically 0.80

The calculator outputs the sample size needed per variant. Multiply by the number of variants and divide by the total daily traffic entering the test to estimate runtime in days.
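For readers who want to see what such a calculator does with those four inputs, here is a minimal sketch using the common normal-approximation formula for comparing two proportions with a two-sided test. Real calculators may use a pooled variance, a one-sided test, or a continuity correction, so their results can differ somewhat. The function name sample_size_per_variant and the example inputs (5% baseline, 10% relative MDE) are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-sided two-proportion test."""
    p1 = baseline                        # control conversion rate
    p2 = baseline * (1 + relative_mde)   # variant rate at the minimum detectable effect
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)           # critical value for the significance level
    z_power = z(power)                   # quantile corresponding to the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


# Hypothetical inputs: 5% baseline conversion rate, 10% relative MDE.
n = sample_size_per_variant(baseline=0.05, relative_mde=0.10)
print(f"visitors needed per variant: {n:,}")
```

Note that the result is a count of visitors per variant, not conversions, and that halving the MDE roughly quadruples the required sample because the effect size is squared in the denominator.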

Power in Practice

If each variant receives 500 visitors per day (1,000 visitors per day split evenly between control and variant) and the calculator says you need 10,000 visitors per variant, expect a 20-day minimum runtime. In practice, round up to 21 days so the test covers whole weeks and day-of-week effects balance out. Stopping earlier does not increase power; it reduces it.
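The runtime arithmetic can be sketched as a small helper. It assumes traffic is split evenly across variants and that rounding up to whole weeks is the chosen way to handle day-of-week effects; the function name runtime_days and the example figures (10,000 visitors per variant, 1,000 visitors per day in total) are illustrative.

```python
import math


def runtime_days(n_per_variant, daily_visitors, n_variants=2, full_weeks=True):
    """Estimate test runtime in days from the required sample size.

    daily_visitors is the total daily traffic entering the test,
    assumed to be split evenly across all variants.
    """
    total_needed = n_per_variant * n_variants
    days = math.ceil(total_needed / daily_visitors)
    if full_weeks:
        # Round up to a whole number of weeks so every weekday is equally represented.
        days = math.ceil(days / 7) * 7
    return days


print(runtime_days(10_000, 1_000, full_weeks=False))  # minimum runtime in days
print(runtime_days(10_000, 1_000))                    # rounded up to whole weeks
```

With these inputs the minimum is 20 days, and rounding up to whole weeks gives 21.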