Bayesian A/B testing is an alternative to the classical (frequentist) approach. Instead of asking "is there enough evidence to reject the null hypothesis?", it asks "given what we've observed, what is the probability that variant B is better than control A?"
The output is an intuitive probability: "There is an 87% chance variant B beats control."
Frequentist vs. Bayesian
| | Frequentist A/B Testing | Bayesian A/B Testing |
|---|---|---|
| Primary output | p-value, confidence interval | Probability B > A, credible interval |
| Null hypothesis | Required | Not used |
| Prior knowledge | Ignored | Incorporated as prior distribution |
| Peeking | Inflates false positives | More tolerant of continuous monitoring, though not immune to early-stopping bias |
| Interpretability | Counterintuitive | Directly interpretable |
| Sample size | Fixed upfront | Can be flexible |
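The "Peeking" row can be illustrated with a small simulation. This is a minimal sketch covering only the frequentist side of that row: it runs A/A tests (both arms share the same true rate, so every significant result is a false positive) and compares checking a two-proportion z-test at every interim look against checking once at the end. The function names, the 10% true rate, the batch size, and the number of looks are all assumptions chosen for illustration.

```python
import math
import random


def z_test_significant(conv_a, n_a, conv_b, n_b, z_crit=1.96):
    """Two-sided pooled two-proportion z-test at roughly the 5% level."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return False  # no variation observed yet, nothing to test
    z = (conv_b / n_b - conv_a / n_a) / se
    return abs(z) > z_crit


def false_positive_rates(true_rate=0.10, looks=10, batch=200, sims=500, seed=0):
    """Simulate A/A tests; return (rate with peeking, rate with one final look)."""
    rng = random.Random(seed)
    peeking_hits = 0
    fixed_hits = 0
    for _sim in range(sims):
        conv_a = conv_b = n = 0
        flagged_at_any_look = False
        significant_now = False
        for _look in range(looks):
            # Add one batch of visitors to each arm, same true rate in both.
            conv_a += sum(rng.random() < true_rate for _ in range(batch))
            conv_b += sum(rng.random() < true_rate for _ in range(batch))
            n += batch
            significant_now = z_test_significant(conv_a, n, conv_b, n)
            flagged_at_any_look = flagged_at_any_look or significant_now
        peeking_hits += flagged_at_any_look  # stop at the first "significant" look
        fixed_hits += significant_now        # result at the final look only
    return peeking_hits / sims, fixed_hits / sims


if __name__ == "__main__":
    peeking, fixed = false_positive_rates()
    print(f"false positives with peeking: {peeking:.1%}")
    print(f"false positives with a single final look: {fixed:.1%}")
```

Because the final look is one of the interim looks, the peeking rate can never be lower than the fixed-endpoint rate; the simulation shows how much higher it gets under these assumed settings.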
How It Works
- Set a prior — Encode your belief about the baseline conversion rate before the test starts (often a weak, uninformative prior)
- Collect data — Observe conversions in control and variant
- Update the posterior — Using Bayes' theorem, combine the prior with observed data to get a probability distribution over possible conversion rates
- Read the result — The posterior gives you "probability variant beats control" and an expected lift estimate
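The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a Beta-Binomial model (a Beta prior on each arm's conversion rate, which stays a Beta after observing conversions), a uniform Beta(1, 1) prior by default, and Monte Carlo sampling to estimate the probability that B beats A. The function name `prob_b_beats_a` and the example counts are hypothetical.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b,
                   prior_alpha=1.0, prior_beta=1.0,
                   draws=50_000, seed=0):
    """Estimate P(rate_B > rate_A) and the mean absolute lift (rate_B - rate_A).

    Assumes each arm's conversion rate has a Beta(prior_alpha, prior_beta) prior.
    """
    rng = random.Random(seed)

    # Step 3: posterior = Beta(prior_alpha + conversions, prior_beta + non-conversions)
    a_alpha, a_beta = prior_alpha + conv_a, prior_beta + (n_a - conv_a)
    b_alpha, b_beta = prior_alpha + conv_b, prior_beta + (n_b - conv_b)

    # Step 4: sample both posteriors and compare the draws pairwise.
    wins = 0
    lift_total = 0.0
    for _ in range(draws):
        rate_a = rng.betavariate(a_alpha, a_beta)
        rate_b = rng.betavariate(b_alpha, b_beta)
        wins += rate_b > rate_a
        lift_total += rate_b - rate_a
    return wins / draws, lift_total / draws


if __name__ == "__main__":
    # Hypothetical data: 120/1000 conversions in control, 150/1000 in the variant.
    p_better, lift = prob_b_beats_a(120, 1000, 150, 1000)
    print(f"P(B > A) = {p_better:.3f}, expected absolute lift = {lift:.4f}")
```

Sampling is used here because it is easy to read and extends to other metrics; a closed-form or numerical-integration approach would also work for the Beta-Binomial case.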
When to Use Bayesian Methods
Bayesian testing works well when:
- You need to make decisions continuously rather than at a fixed endpoint
- You want intuitive outputs stakeholders can act on without statistics training
- You have a prior sense of the baseline rate and want to incorporate it
- You're running many tests and want to learn across experiments
Limitations
- Results depend on the choice of prior — a strong incorrect prior can mislead
- Harder to audit and reproduce than a simple p-value calculation
- "Probability B > A" is not the same as "B will beat A in production" — there's still uncertainty in the estimate
- Not universally accepted as a replacement for frequentist methods in regulated industries
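The first limitation, sensitivity to the prior, can be made concrete with a short calculation. This sketch assumes the same Beta-Binomial model, in which the posterior mean is (prior_alpha + conversions) / (prior_alpha + prior_beta + visitors). The helper name `posterior_mean` and the numbers are hypothetical: a variant converting at 30% over 100 visitors, evaluated under a uniform Beta(1, 1) prior and under a strong Beta(200, 1800) prior centered on 10%.

```python
def posterior_mean(conversions, visitors, prior_alpha, prior_beta):
    """Posterior mean conversion rate under a Beta(prior_alpha, prior_beta) prior."""
    return (prior_alpha + conversions) / (prior_alpha + prior_beta + visitors)


if __name__ == "__main__":
    conversions, visitors = 30, 100  # hypothetical observed data (30% raw rate)

    weak = posterior_mean(conversions, visitors, 1, 1)         # uniform prior
    strong = posterior_mean(conversions, visitors, 200, 1800)  # acts like 2,000 prior visitors at 10%

    print(f"posterior mean with Beta(1, 1) prior:      {weak:.3f}")
    print(f"posterior mean with Beta(200, 1800) prior: {strong:.3f}")
```

With the uniform prior the estimate stays close to the observed rate, while the strong prior pulls it most of the way back toward 10%, because 100 real observations are outweighed by the 2,000 pseudo-observations the prior encodes.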
Many modern experimentation platforms (Optimizely, VWO, and successors to the discontinued Google Optimize) offer Bayesian modes alongside frequentist tests.