Why confidence intervals matter in experiments
When running an A/B test, it’s tempting to look only at the p-value and decide if variant B “beats” variant A. But confidence intervals (CIs) provide much richer insight: they tell you the likely range of the effect size and help connect experimental outcomes to real business metrics like revenue.
Confidence intervals in A/B tests
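Suppose variant B shows a 5% relative lift in conversion rate compared to A, with a 95% CI of (2%, 8%). Loosely, the data are consistent with a true lift anywhere between 2% and 8%. Strictly, the 95% describes the method rather than this one interval: if you repeated the experiment many times, about 95% of the intervals built this way would contain the true effect.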
- If the entire CI is above zero → the data support a positive lift at the chosen confidence level.
- If the CI includes zero → the effect could be positive, negative, or absent; the test has not settled the direction.
- The width of the CI tells you about precision: a narrower interval means a more precise estimate of the effect size, which usually comes from a larger sample.
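The reading rules above can be sketched in a few lines of Python. This is a minimal illustration, not a full analysis pipeline: it assumes a simple Wald (normal-approximation) interval for the difference between two conversion rates, and the function name and visitor counts are hypothetical. It reports the absolute difference in conversion rate; a relative lift would need an extra step.

```python
from math import sqrt
from statistics import NormalDist


def diff_in_proportions_ci(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Wald (normal-approximation) CI for p_b - p_a, in absolute terms."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a
    # Unpooled standard error of the difference between two proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    # Two-sided critical value (about 1.96 for 95% confidence).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical counts: 10,000 visitors per arm.
diff, low, high = diff_in_proportions_ci(
    conv_a=1000, n_a=10_000, conv_b=1100, n_b=10_000
)
print(f"Absolute lift: {diff:.4f} (95% CI {low:.4f} to {high:.4f})")

if low > 0:
    print("Interval lies entirely above zero")
elif high < 0:
    print("Interval lies entirely below zero")
else:
    print("Interval includes zero")
```

The Wald interval is chosen here for brevity. With small samples or conversion rates near 0% or 100%, other interval methods tend to behave better.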
From percentages to revenue impact
A conversion lift or average order value change is only the first step. To tie results to business outcomes, apply the CI bounds to your traffic and revenue base.
- Example: If baseline monthly revenue is $1,000,000 and your CI for relative lift is (2%, 8%), then, assuming revenue scales in proportion to the lift, the plausible monthly revenue impact is between $20,000 and $80,000.
- This range helps stakeholders understand the risk and potential upside instead of just seeing a single “+5%” number.
This translation of CI into dollars makes results actionable for product, finance, and leadership teams.
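The arithmetic behind the example above is small enough to sketch directly. The helper name is hypothetical, and the calculation assumes that revenue moves in proportion to the relative lift (for instance, that average order value stays unchanged).

```python
def revenue_impact_range(baseline_revenue, lift_low, lift_high):
    """Scale relative-lift CI bounds to currency.

    Assumes revenue changes in proportion to the lift.
    """
    return baseline_revenue * lift_low, baseline_revenue * lift_high


# Figures from the example: $1,000,000 baseline, CI of (2%, 8%).
low_usd, high_usd = revenue_impact_range(1_000_000, 0.02, 0.08)
print(f"Plausible monthly impact: ${low_usd:,.0f} to ${high_usd:,.0f}")
```

If the tested metric feeds revenue only indirectly, the proportionality assumption should be replaced with whatever model links the metric to revenue.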
Why not just p-values?
A p-value tells you how surprising your data would be if there were no real difference between variants. It does not tell you how large or meaningful the effect is. Confidence intervals:
- Show the magnitude of the effect.
- Communicate uncertainty directly.
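- Allow comparison of business outcomes on a common scale (e.g., how far the CIs from two tests overlap), with the caveat that overlapping intervals alone do not show that two effects are equal.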
By focusing on CIs, you avoid false confidence in “winner-takes-all” decisions and gain a fuller picture of expected performance.
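To make the contrast concrete, here is a small sketch with two hypothetical experiments. Both come out statistically significant, yet their intervals describe very different effect sizes. The counts are invented for illustration, and the p-value uses a two-sided Wald z-test with the same unpooled standard error as the interval, which is one of several reasonable choices.

```python
from math import sqrt
from statistics import NormalDist


def p_value_and_ci(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-sided Wald z-test p-value and CI for p_b - p_a (absolute)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    norm = NormalDist()
    # Probability of a difference at least this extreme if the true one is zero.
    p_value = 2 * (1 - norm.cdf(abs(diff / se)))
    margin = norm.inv_cdf(1 - (1 - confidence) / 2) * se
    return p_value, diff - margin, diff + margin


# Hypothetical experiment 1: huge sample, tiny effect.
big = p_value_and_ci(conv_a=200_000, n_a=2_000_000, conv_b=202_000, n_b=2_000_000)
# Hypothetical experiment 2: modest sample, larger effect.
small = p_value_and_ci(conv_a=200, n_a=2_000, conv_b=260, n_b=2_000)

for name, (p, low, high) in [("large sample", big), ("small sample", small)]:
    print(f"{name}: p = {p:.4f}, 95% CI = ({low:.4%}, {high:.4%})")
```

Both p-values fall below 0.05, so a significance-only readout would call both tests winners. The intervals show that the first effect is a small fraction of a percentage point, while the second could be anywhere from roughly one to five percentage points.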
Practical tips for interpreting CIs in experiments
- Always report the CI alongside the estimated effect.
- Use visualizations like error bars to make ranges intuitive.
- Frame results in business terms—conversions, revenue, or customer lifetime value.
- For multiple metrics (e.g., CTR, AOV, churn), interpret each CI to assess trade-offs.
Helpful resources
For those designing or analyzing experiments, practical guides are invaluable. A book on A/B testing explains statistical interpretation in depth. To strengthen statistical foundations, a solid statistics textbook or quick study guide is useful for everyday reference.
The bottom line
Confidence intervals bridge the gap between statistical output and business impact. Instead of asking, “Is B better than A?” the better question is, “What is the range of possible improvements, and what does that mean for revenue?” This perspective makes your experiments more informative, actionable, and aligned with real-world decisions.