A/B testing answers the question: "Which variant performs better?" Incrementality testing answers a harder question: "Does this marketing activity actually cause conversions — or would those conversions have happened anyway?"
That distinction matters more than most teams realize. If you're spending $50,000 a month on retargeting ads and most of the conversions they're credited with would have occurred organically, you're not generating $50,000 in value — you're paying $50,000 to take credit for conversions your customers would have made regardless.
Incrementality testing is how you find out the truth.
## What Incrementality Measures
The core concept is causal lift: the additional conversions that occur specifically because of your marketing activity, above and beyond what would have happened without it.
The formula is straightforward:
Incremental lift = Conversion rate (exposed group) − Conversion rate (holdout group)
The challenge is constructing the holdout group correctly. Unlike an A/B test where you control the experience on your own site, incrementality testing often involves controlling exposure to external channels — paid ads, email campaigns, out-of-home placements — where randomly withholding exposure from users requires coordination with ad platforms.
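In code, the lift formula is just a difference of rates. A minimal sketch (the group sizes and conversion counts below are made up for illustration):

```python
def incremental_lift(exposed_conversions, exposed_size,
                     holdout_conversions, holdout_size):
    """Incremental lift = exposed conversion rate - holdout conversion rate."""
    exposed_rate = exposed_conversions / exposed_size
    holdout_rate = holdout_conversions / holdout_size
    return exposed_rate - holdout_rate

# Hypothetical numbers: 2.0% exposed rate vs. 1.5% holdout rate
lift = incremental_lift(2000, 100000, 150, 10000)
print(f"{lift:.4f}")  # 0.0050, i.e. half a percentage point of lift
```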
## How Incrementality Tests Work
There are three common methods for running incrementality tests:
### User-Level Holdouts
You randomly split your audience into two groups. The treatment group sees your ads normally. The holdout group is excluded from seeing any ad in the campaign — either through platform-level audience exclusions or by showing a PSA (public service announcement) ad as a neutral control.
At the end of the measurement window, you compare conversion rates between the two groups. The difference is your incremental lift.
**Best for:** Digital advertising channels where you can control audience exposure at the user level (Facebook, Google, programmatic display).
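For a user-level holdout, a standard way to check whether the measured lift is distinguishable from noise is a two-proportion z-test on the exposed and holdout conversion rates. A minimal stdlib-only sketch; the campaign numbers are hypothetical:

```python
import math

def two_proportion_z(conv_t, n_t, conv_c, n_c):
    """z-statistic for the difference between treatment and control rates."""
    p_t, p_c = conv_t / n_t, conv_c / n_c
    pooled = (conv_t + conv_c) / (n_t + n_c)  # pooled rate under H0: no lift
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    return (p_t - p_c) / se

# Hypothetical 95/5 treatment/holdout split on a mid-size campaign
z = two_proportion_z(conv_t=1900, n_t=95000, conv_c=75, n_c=5000)
print(f"lift = {1900/95000 - 75/5000:.4f}, z = {z:.2f}")
# |z| > 1.96 means the lift is significant at the 5% level (two-sided)
```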
### Geo-Based Holdouts
Instead of splitting users, you split geographic markets. You run your campaign in treatment markets and pause it (or reduce it) in holdout markets that are demographically similar. You then compare performance across markets, accounting for baseline differences.
**Best for:** Channels where user-level holdouts aren't feasible — TV, radio, direct mail, out-of-home, or campaigns where platform targeting makes audience exclusion difficult.
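One common way to account for baseline differences across markets is a difference-in-differences comparison: each group's performance during the campaign is measured against its own pre-campaign baseline. A minimal sketch with hypothetical weekly conversion counts:

```python
def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Change in treatment markets minus change in control markets.

    Differencing each group against its own baseline removes level
    differences between markets (e.g. one region simply converts more).
    """
    return (treat_post - treat_pre) - (control_post - control_pre)

# Hypothetical weekly conversions, averaged per market
lift = diff_in_diff(treat_pre=400, treat_post=520,
                    control_pre=380, control_post=410)
print(lift)  # (520-400) - (410-380) = 90 incremental conversions per week
```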
### Time-Based Holdouts
You alternate between periods of running a campaign and periods of pausing it, then compare conversion rates across those windows. This works for some channels but is vulnerable to seasonality and other time-based confounders.
**Best for:** Simple channels with stable demand curves. Use with caution — the validity of the control depends heavily on how stable your baseline is week-to-week.
## Incrementality vs. A/B Testing: Key Differences
| Dimension | A/B Testing | Incrementality Testing |
|---|---|---|
| Question answered | Which variant converts better? | Does this activity cause conversions? |
| Control condition | Different experience on-site | No exposure to the campaign |
| What it measures | Relative performance | True causal lift |
| Common use cases | Landing pages, emails, CTAs | Paid ads, channel mix, attribution |
| Statistical basis | Frequentist or Bayesian significance | Confidence interval on lift estimate |
| Risk | Type I errors, peeking | Selection bias in holdout construction |
The key distinction: A/B tests measure which experience is better. Incrementality tests measure whether the activity is worth doing at all.
## Why Attribution Models Aren't Enough
Many teams rely on multi-touch attribution models to understand channel effectiveness. The problem is that attribution doesn't measure incrementality — it measures credit.
If a customer sees a Facebook ad, then a Google Search ad, then visits your site directly and converts, attribution models will assign some portion of credit to each touchpoint. But none of that tells you whether any individual channel actually caused the conversion. The customer might have searched and converted without the Facebook ad ever being shown.
Incrementality testing bypasses the attribution problem entirely by measuring actual causal impact through controlled exposure.
## When to Run an Incrementality Test
Incrementality tests are most valuable when:
- **You're spending significantly on retargeting** — Retargeting campaigns are notoriously prone to attribution inflation because they target people who were already considering converting
- **You're evaluating a new channel** — Before scaling investment in a channel, confirm it's driving incremental volume, not just capturing conversions from other sources
- **Attribution data conflicts** — When different attribution models give you wildly different answers about channel value, an incrementality test is the tiebreaker
- **You're making large budget allocation decisions** — The cost of a holdout (slightly lower short-term conversion volume) is worth it before reallocating millions in spend
## Limitations and Common Mistakes
**Holdout contamination** — If your holdout audience sees ads through other campaigns or channels, the control is no longer clean. Run incrementality tests on isolated campaigns or well-segmented audiences.

**Small holdout groups** — A 5% holdout on a low-volume campaign often lacks the statistical power to detect meaningful lift differences. Size holdouts so you can detect a realistic minimum detectable effect.

**Short measurement windows** — Many conversions happen days or weeks after first exposure. Cutting off measurement too early understates incremental lift. Match your measurement window to your typical conversion lag.

**Ignoring long-term effects** — Incrementality tests only count conversions inside the measurement window. Brand-building activities may produce lift that extends months beyond the test period, so incrementality testing tends to undervalue them.

**Confusing lift with efficiency** — A campaign might show positive incremental lift but still be an inefficient use of budget compared to alternatives. Pair lift measurement with cost-per-incremental-conversion to make budget decisions.
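The point about small holdout groups can be made concrete with the standard two-proportion sample-size approximation, which estimates how many users each group needs in order to detect a given absolute lift. A minimal sketch assuming a 5% significance level and 80% power; the baseline rate and target lift below are illustrative:

```python
import math

def holdout_size(baseline_rate, mde, z_alpha=1.96, z_beta=0.84):
    """Approximate users needed per group to detect an absolute lift of `mde`
    at ~5% significance (z_alpha=1.96) with ~80% power (z_beta=0.84)."""
    p1 = baseline_rate
    p2 = baseline_rate + mde
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / mde ** 2
    return math.ceil(n)

# Hypothetical: 1.5% baseline conversion rate, want to detect +0.5pp of lift
print(holdout_size(0.015, 0.005))  # roughly 10,800 users per group
```

If your holdout would be far smaller than this, either enlarge it or accept a bigger minimum detectable effect before running the test.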
## Incrementality in Practice
The practical sequence for most teams:
1. Identify your highest-spend channels where incrementality is uncertain
2. Design a holdout test for one channel at a time, starting with your retargeting campaigns
3. Run the test long enough to cover your typical conversion lag (usually 2–4 weeks minimum)
4. Calculate your cost-per-incremental-conversion and compare it against your target CAC
5. Use the results to inform budget allocation decisions before the next planning cycle
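The last two steps of that sequence reduce to one ratio: spend divided by incremental conversions, compared against your target CAC. A minimal sketch with hypothetical campaign numbers:

```python
def cost_per_incremental_conversion(spend, exposed_conversions,
                                    holdout_rate, exposed_size):
    """Spend divided by conversions above the holdout baseline."""
    expected_baseline = holdout_rate * exposed_size  # conversions you'd get anyway
    incremental = exposed_conversions - expected_baseline
    return spend / incremental

# Hypothetical: $50,000 spend, 2,000 attributed conversions, and a holdout
# showing a 1.5% organic conversion rate across 100,000 exposed users
cpic = cost_per_incremental_conversion(50_000, 2_000, 0.015, 100_000)
print(f"${cpic:.2f} per incremental conversion")  # $100.00: only 500 were incremental
target_cac = 80.0
print("worth scaling" if cpic <= target_cac else "reallocate budget")  # reallocate budget
```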
Incrementality testing is one of the most underutilized tools in digital marketing — not because it's technically complex, but because it requires intentionally giving up short-term attributed conversions to measure real impact. The teams that do it consistently make dramatically better budget decisions over time.
For teams running continuous optimization across multiple channels, Surface AI surfaces the data signals that matter — so you're making decisions based on what's actually driving growth, not which channel takes credit for it.