Ran a test? Find out whether the winner is real or just noise. Enter the visitors and conversions for each variant, and we'll tell you the conversion rates, the uplift, and how confident you can be.
We design, run and read A/B tests that actually move CPA and conversion — methodically, every month.
When you run an A/B test, one variant almost always "wins" by some margin. Statistical significance addresses the question that matters most: is that difference real, or could it have happened by random chance? A result is conventionally called significant at the 95% level when the p-value is below 0.05, meaning a difference at least this large would show up less than 5% of the time if the two variants actually performed the same.
This calculator runs a two-tailed two-proportion z-test — the standard method for comparing two conversion rates — and translates the result into plain language: your confidence level, the p-value, and a clear verdict.
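For the curious, here is a minimal sketch of a two-tailed two-proportion z-test in Python. It is an illustration rather than this calculator's actual code: the function name, the argument names, and the choice of a pooled standard error are all assumptions.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(visitors_a, conversions_a, visitors_b, conversions_b):
    """Two-tailed two-proportion z-test (hypothetical helper, pooled standard error)."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pooled rate: the single conversion rate both variants would share
    # if there were no real difference between them.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        raise ValueError("Cannot test: pooled conversion rate is 0% or 100%.")

    z = (rate_b - rate_a) / std_err

    # Two-tailed p-value: chance of a gap at least this large in either direction.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    return {
        "rate_a": rate_a,
        "rate_b": rate_b,
        "z": z,
        "p_value": p_value,
        "significant_at_95": p_value < 0.05,
    }


# Made-up example: 1,000 visitors per variant, 100 vs 130 conversions.
result = two_proportion_z_test(1000, 100, 1000, 130)
print(result)
```

With those made-up numbers the p-value lands at roughly 0.035, so the result would count as significant at 95% but not at 99%.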
Confidence level: How sure you can be that the winning variant is genuinely better. 95%+ is the common bar; 99%+ is very strong.
P-value: The probability of seeing a difference at least this large if the two variants were actually identical. Lower means stronger evidence; under 0.05 means significant at 95%.
Relative uplift: How much better the variation converts versus the control, as a percentage of the control's rate. A jump from 10% to 13% is a 30% relative uplift (3 points gained on a base of 10).
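The uplift arithmetic can be sketched in a few lines of Python. The function name and its arguments are hypothetical and chosen only for illustration.

```python
def relative_uplift(control_rate, variant_rate):
    """Relative uplift of the variant over the control, as a percentage."""
    if control_rate <= 0:
        raise ValueError("Control rate must be above zero to compute uplift.")
    # Difference between the two rates, expressed as a share of the control rate.
    return (variant_rate - control_rate) / control_rate * 100


# The example from the text: a jump from 10% to 13%.
print(round(relative_uplift(0.10, 0.13), 1))
```

Note the distinction from the absolute difference: the same jump is 3 percentage points in absolute terms.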
Significance isn't the same as "enough data." Small samples can show big swings that vanish with more traffic, and calling a test too early is one of the most common (and expensive) mistakes in marketing. Let tests run to a planned sample size, and beware of peeking: checking results repeatedly and stopping the moment they cross 95% inflates the false-positive rate. If you'd like a second pair of (senior) eyes on your testing program, that's literally our favorite thing to do.
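One way to arrive at a planned sample size is the standard normal-approximation formula for comparing two proportions, sketched below in Python. Treat it as a rough planning aid under stated assumptions: the function name, the 80% power default, and the idea of planning around a minimum relative uplift are ours for illustration and are not taken from the calculator.

```python
from math import ceil, sqrt
from statistics import NormalDist


def visitors_per_variant(baseline_rate, min_relative_uplift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-tailed test.

    baseline_rate: expected control conversion rate, e.g. 0.10
    min_relative_uplift: smallest uplift worth detecting, e.g. 0.30 for 30%
    alpha: accepted false-positive rate (0.05 matches the 95% bar)
    power: chance of detecting the uplift if it is real (assumed default)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_relative_uplift)
    if not (0 < p1 < 1 and 0 < p2 < 1 and p1 != p2):
        raise ValueError("Rates must be between 0 and 1 and must differ.")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_avg = (p1 + p2) / 2

    numerator = (
        z_alpha * sqrt(2 * p_avg * (1 - p_avg))
        + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Made-up plan: 10% baseline, and we care about a 30% relative uplift or more.
print(visitors_per_variant(0.10, 0.30))
```

Smaller uplifts need far more traffic: halving the uplift you want to detect roughly quadruples the visitors required.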
95% is the standard in marketing and most science. For high-stakes or hard-to-reverse decisions, wait for 99%. For quick, low-risk iterations some teams accept 90% — just know you're taking on more risk of a false positive.
Usually it means you need more data, or the true difference is small. Keep the test running toward a pre-planned sample size, or test a bolder change that's more likely to move the needle.
This tool compares two variants (A vs B). Multivariate and multi-arm tests need corrections for multiple comparisons — happy to help you set those up properly.
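As one illustration of such a correction, here is a sketch of the Bonferroni adjustment in Python. Bonferroni is a common and conservative choice, not necessarily the right one for your test; the function name and the sample p-values are made up.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which comparisons stay significant after a Bonferroni correction."""
    if not p_values:
        raise ValueError("Need at least one p-value.")
    # Split the overall false-positive budget evenly across all comparisons.
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Made-up p-values for three variants, each compared against the control.
print(bonferroni_significant([0.01, 0.03, 0.20]))
```

With three comparisons the bar drops from 0.05 to about 0.0167, so a p-value of 0.03 that would pass in a simple A vs B test no longer counts.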
It's the classic frequentist two-proportion z-test that underpins most A/B testing platforms. Some tools layer on Bayesian methods or sequential testing — useful, but the core idea is the same.