A/B testing answers the question: "Which variant performs better?" Incrementality testing answers a harder question: "Does this marketing activity actually cause conversions — or would those conversions have happened anyway?"
That distinction matters more than most teams realize. If you're spending $50,000 a month on retargeting ads and most of the conversions they're credited with would have occurred organically, you're not generating $50,000 in value — you're paying $50,000 to take credit for conversions your customers would have made regardless.
Incrementality testing is how you find out the truth.
## What Incrementality Measures
The core concept is causal lift: the additional conversions that occur specifically because of your marketing activity, above and beyond what would have happened without it.
The formula is straightforward:
Incremental lift = Conversion rate (exposed group) − Conversion rate (holdout group)
The challenge is constructing the holdout group correctly. Unlike an A/B test where you control the experience on your own site, incrementality testing often involves controlling exposure to external channels — paid ads, email campaigns, out-of-home placements — where randomly withholding exposure from users requires coordination with ad platforms.
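In code, the lift formula is just a difference of rates. A minimal sketch (the group sizes and conversion counts below are made up for illustration):

```python
def incremental_lift(exposed_conversions, exposed_size,
                     holdout_conversions, holdout_size):
    """Incremental lift = exposed conversion rate - holdout conversion rate."""
    exposed_rate = exposed_conversions / exposed_size
    holdout_rate = holdout_conversions / holdout_size
    return exposed_rate - holdout_rate

# Hypothetical numbers: 2.0% exposed rate vs. 1.5% holdout rate
lift = incremental_lift(2000, 100000, 150, 10000)
print(f"{lift:.4f}")  # 0.0050, i.e. half a percentage point of lift
```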
## How Incrementality Tests Work
There are three common methods for running incrementality tests:
### User-Level Holdouts
You randomly split your audience into two groups. The treatment group sees your ads normally. The holdout group is excluded from seeing any ad in the campaign — either through platform-level audience exclusions or by showing a PSA (public service announcement) ad as a neutral control.
At the end of the measurement window, you compare conversion rates between the two groups. The difference is your incremental lift.
**Best for:** Digital advertising channels where you can control audience exposure at the user level (Facebook, Google, programmatic display).
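For a user-level holdout, a standard way to check whether the measured lift is distinguishable from noise is a two-proportion z-test on the exposed and holdout conversion rates. A minimal stdlib-only sketch; the campaign numbers are hypothetical:

```python
import math

def two_proportion_z(conv_t, n_t, conv_c, n_c):
    """z-statistic for the difference between treatment and control rates."""
    p_t, p_c = conv_t / n_t, conv_c / n_c
    pooled = (conv_t + conv_c) / (n_t + n_c)  # pooled rate under H0: no lift
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    return (p_t - p_c) / se

# Hypothetical 95/5 treatment/holdout split on a mid-size campaign
z = two_proportion_z(conv_t=1900, n_t=95000, conv_c=75, n_c=5000)
print(f"lift = {1900/95000 - 75/5000:.4f}, z = {z:.2f}")
# |z| > 1.96 means the lift is significant at the 5% level (two-sided)
```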
### Geo-Based Holdouts
Instead of splitting users, you split geographic markets. You run your campaign in treatment markets and pause it (or reduce it) in holdout markets that are demographically similar. You then compare performance across markets, accounting for baseline differences.
**Best for:** Channels where user-level holdouts aren't feasible — TV, radio, direct mail, out-of-home, or campaigns where platform targeting makes audience exclusion difficult.
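One common way to account for baseline differences across markets is a difference-in-differences comparison: each group's performance during the campaign is measured against its own pre-campaign baseline. A minimal sketch with hypothetical weekly conversion counts:

```python
def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Change in treatment markets minus change in control markets.

    Differencing each group against its own baseline removes level
    differences between markets (e.g. one region simply converts more).
    """
    return (treat_post - treat_pre) - (control_post - control_pre)

# Hypothetical weekly conversions, averaged per market
lift = diff_in_diff(treat_pre=400, treat_post=520,
                    control_pre=380, control_post=410)
print(lift)  # (520-400) - (410-380) = 90 incremental conversions per week
```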
### Time-Based Holdouts
You alternate between periods of running a campaign and periods of pausing it, then compare conversion rates across those windows. This works for some channels but is vulnerable to seasonality and other time-based confounders.
**Best for:** Simple channels with stable demand curves. Use with caution — the validity of the control depends heavily on how stable your baseline is week-to-week.
## Incrementality vs. A/B Testing: Key Differences
| Dimension | A/B Testing | Incrementality Testing |
|---|---|---|
| Question answered | Which variant converts better? | Does this activity cause conversions? |
| Control condition | Different experience on-site | No exposure to the campaign |
| What it measures | Relative performance | True causal lift |
| Common use cases | Landing pages, emails, CTAs | Paid ads, channel mix, attribution |
| Statistical basis | Frequentist or Bayesian significance | Confidence interval on lift estimate |
| Risk | Type I errors, peeking | Selection bias in holdout construction |
The key distinction: A/B tests measure which experience is better. Incrementality tests measure whether the activity is worth doing at all.
## Why Attribution Models Aren't Enough
Many teams rely on multi-touch attribution models to understand channel effectiveness. The problem is that attribution doesn't measure incrementality — it measures credit.
If a customer sees a Facebook ad, then a Google Search ad, then visits your site directly and converts, attribution models will assign some portion of credit to each touchpoint. But none of that tells you whether any individual channel actually caused the conversion. The customer might have searched and converted without the Facebook ad ever being shown.
Incrementality testing bypasses the attribution problem entirely by measuring actual causal impact through controlled exposure.
## When to Run an Incrementality Test
Incrementality tests are most valuable when:
- **You're spending significantly on retargeting** — Retargeting campaigns are notoriously prone to attribution inflation because they target people who were already considering converting
- **You're evaluating a new channel** — Before scaling investment in a channel, confirm it's driving incremental volume, not just capturing conversions from other sources
- **Attribution data conflicts** — When different attribution models give you wildly different answers about channel value, an incrementality test is the tiebreaker
- **You're making large budget allocation decisions** — The cost of a holdout (slightly lower short-term conversion volume) is worth it before reallocating millions in spend
## Limitations and Common Mistakes
**Holdout contamination** — If your holdout audience sees ads through other campaigns or channels, the control is no longer clean. Run incrementality tests on isolated campaigns or well-segmented audiences.

**Small holdout groups** — A 5% holdout on a low-volume campaign often lacks the statistical power to detect meaningful lift differences. Size holdouts so you can detect a realistic minimum detectable effect.

**Short measurement windows** — Many conversions happen days or weeks after first exposure. Cutting off measurement too early understates incremental lift. Match your measurement window to your typical conversion lag.

**Ignoring long-term effects** — Incrementality tests only count conversions inside the measurement window. Brand-building activities may produce lift that extends months beyond the test period, so incrementality testing tends to undervalue them.

**Confusing lift with efficiency** — A campaign might show positive incremental lift but still be an inefficient use of budget compared to alternatives. Pair lift measurement with cost-per-incremental-conversion to make budget decisions.
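The point about small holdout groups can be made concrete with the standard two-proportion sample-size approximation, which estimates how many users each group needs in order to detect a given absolute lift. A minimal sketch assuming a 5% significance level and 80% power; the baseline rate and target lift below are illustrative:

```python
import math

def holdout_size(baseline_rate, mde, z_alpha=1.96, z_beta=0.84):
    """Approximate users needed per group to detect an absolute lift of `mde`
    at ~5% significance (z_alpha=1.96) with ~80% power (z_beta=0.84)."""
    p1 = baseline_rate
    p2 = baseline_rate + mde
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / mde ** 2
    return math.ceil(n)

# Hypothetical: 1.5% baseline conversion rate, want to detect +0.5pp of lift
print(holdout_size(0.015, 0.005))  # roughly 10,800 users per group
```

If your holdout would be far smaller than this, either enlarge it or accept a bigger minimum detectable effect before running the test.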
## Incrementality in Practice
The practical sequence for most teams:
1. Identify your highest-spend channels where incrementality is uncertain
2. Design a holdout test for one channel at a time, starting with your retargeting campaigns
3. Run the test long enough to cover your typical conversion lag (usually 2–4 weeks minimum)
4. Calculate your cost-per-incremental-conversion and compare it against your target CAC
5. Use the results to inform budget allocation decisions before the next planning cycle
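The last two steps of that sequence reduce to one ratio: spend divided by incremental conversions, compared against your target CAC. A minimal sketch with hypothetical campaign numbers:

```python
def cost_per_incremental_conversion(spend, exposed_conversions,
                                    holdout_rate, exposed_size):
    """Spend divided by conversions above the holdout baseline."""
    expected_baseline = holdout_rate * exposed_size  # conversions you'd get anyway
    incremental = exposed_conversions - expected_baseline
    return spend / incremental

# Hypothetical: $50,000 spend, 2,000 attributed conversions, and a holdout
# showing a 1.5% organic conversion rate across 100,000 exposed users
cpic = cost_per_incremental_conversion(50_000, 2_000, 0.015, 100_000)
print(f"${cpic:.2f} per incremental conversion")  # $100.00: only 500 were incremental
target_cac = 80.0
print("worth scaling" if cpic <= target_cac else "reallocate budget")  # reallocate budget
```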
Incrementality testing is one of the most underutilized tools in digital marketing — not because it's technically complex, but because it requires intentionally giving up short-term attributed conversions to measure real impact. The teams that do it consistently make dramatically better budget decisions over time.
For teams running continuous optimization across multiple channels, Surface AI surfaces the data signals that matter — so you're making decisions based on what's actually driving growth, not which channel takes credit for it.