Bayesian A/B Testing

Bayesian A/B testing is an experimental framework that quantifies the probability that one variant beats another by updating prior beliefs with observed data, rather than by rejecting or failing to reject a null hypothesis.

It is an alternative to the classical (frequentist) approach. Instead of asking "is there enough evidence to reject the null hypothesis?", it asks "given what we've observed, what is the probability that variant B is better than control A?"

The output is an intuitive probability: "There is an 87% chance variant B beats control."

Frequentist vs. Bayesian

Aspect           | Frequentist A/B Testing      | Bayesian A/B Testing
Primary output   | p-value, confidence interval | Probability B > A, credible interval
Null hypothesis  | Required                     | Not used
Prior knowledge  | Ignored                      | Incorporated as prior distribution
Peeking          | Inflates false positives     | More robust to continuous monitoring
Interpretability | Counterintuitive             | Directly interpretable
Sample size      | Fixed upfront                | Can be flexible

How It Works

  1. Set a prior — Encode your belief about the baseline conversion rate before the test starts (often a weak, uninformative prior)
  2. Collect data — Observe conversions in control and variant
  3. Update the posterior — Using Bayes' theorem, combine the prior with observed data to get a probability distribution over possible conversion rates
  4. Read the result — The posterior gives you "probability variant beats control" and an expected lift estimate
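
The four steps above can be sketched in Python for conversion data. This is a minimal illustration, not any particular platform's implementation: it assumes a uniform Beta(1, 1) prior, a Beta-Binomial model (so the posterior has a closed form), and Monte Carlo sampling to estimate the probability that the variant beats control. The function names and the visitor counts are hypothetical.

```python
import random


def posterior_params(conversions, visitors, prior_alpha=1.0, prior_beta=1.0):
    """Steps 1 and 3: a Beta prior plus binomial data gives a Beta posterior."""
    return prior_alpha + conversions, prior_beta + (visitors - conversions)


def compare_variants(conv_a, n_a, conv_b, n_b,
                     prior_alpha=1.0, prior_beta=1.0, draws=20000, seed=0):
    """Step 4: sample both posteriors and summarize how often B beats A."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    a_alpha, a_beta = posterior_params(conv_a, n_a, prior_alpha, prior_beta)
    b_alpha, b_beta = posterior_params(conv_b, n_b, prior_alpha, prior_beta)

    wins = 0
    lifts = []
    for _ in range(draws):
        rate_a = rng.betavariate(a_alpha, a_beta)
        rate_b = rng.betavariate(b_alpha, b_beta)
        wins += rate_b > rate_a
        # Relative lift of B over A for this posterior draw.
        lifts.append((rate_b - rate_a) / rate_a)

    lifts.sort()
    return {
        "prob_b_beats_a": wins / draws,
        "expected_lift": sum(lifts) / draws,
        # Central 95% credible interval for the relative lift.
        "lift_interval_95": (lifts[int(0.025 * draws)], lifts[int(0.975 * draws)]),
    }


# Step 2: hypothetical observed data (conversions, visitors) per arm.
result = compare_variants(conv_a=120, n_a=2400, conv_b=150, n_b=2400)
print(result)
```

Here `result["prob_b_beats_a"]` corresponds to the "probability variant beats control" in step 4, and `expected_lift` is the mean relative lift across the posterior draws. Using an informative prior instead only changes `prior_alpha` and `prior_beta`.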

When to Use Bayesian Methods

Bayesian testing works well when:

  • You need to make decisions continuously rather than at a fixed endpoint
  • You want intuitive outputs stakeholders can act on without statistics training
  • You have a prior sense of the baseline rate and want to incorporate it
  • You're running many tests and want earlier results to inform the priors for later experiments

Limitations

  • Results depend on the choice of prior — a strong incorrect prior can mislead
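  • Harder to audit and reproduce than a simple p-value calculation, since the result depends on the chosen prior and, when posteriors are sampled, on simulation settings
  • "Probability B > A" is not a guarantee that B will beat A in production: it is an estimate conditional on the model, the prior, and the data collected so far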
  • Not universally accepted as a replacement for frequentist methods in regulated industries
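
The first limitation, prior sensitivity, is easy to see with a little arithmetic. The sketch below assumes the same Beta-Binomial model for a conversion rate and compares a weak Beta(1, 1) prior with a deliberately strong prior centred on the wrong rate. The data and both priors are made up for illustration.

```python
def posterior_mean(conversions, visitors, prior_alpha, prior_beta):
    """Mean of the Beta posterior after a Beta prior is updated with binomial data."""
    alpha = prior_alpha + conversions
    beta = prior_beta + (visitors - conversions)
    return alpha / (alpha + beta)


# Hypothetical data: 30 conversions from 200 visitors (15% observed rate).
conversions, visitors = 30, 200

# Weak prior: Beta(1, 1) is uniform and adds only two pseudo-observations.
weak = posterior_mean(conversions, visitors, prior_alpha=1, prior_beta=1)

# Strong prior: worth 2,000 pseudo-visitors, centred on a 5% rate.
strong = posterior_mean(conversions, visitors, prior_alpha=100, prior_beta=1900)

print(f"weak prior:   {weak:.3f}")    # 0.153, close to the observed 15%
print(f"strong prior: {strong:.3f}")  # 0.059, still dominated by the prior
```

With only 200 real visitors, the strong prior outweighs the data ten to one, so the posterior barely moves from 5% even though the observed rate is 15%.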

Many modern experimentation platforms (Optimizely, VWO, Google Optimize successors) offer Bayesian modes alongside frequentist tests.