A/B Testing, Explained

Someone on your team changes the signup button from blue to green. Signups go up 8% the next week. Was it the button? Or was it a Tuesday, a marketing email that went out the same day, or nothing at all — just the normal week-to-week wobble your numbers always have? Without a way to isolate the change, you cannot tell. A/B testing is how you isolate it: split your users into groups at random, show each group a different version, and measure which one actually moves the number you care about.

This guide walks through why randomizing matters, how a real test is put together, and the ways teams talk themselves into believing a result that isn't there.

How to read this

Read it in order. Phase 1 is the core idea — why you need a control group at all. Phase 2 is how a real test is structured, including just enough of "statistical significance" to make the phrase mean something. Phase 3 is the part that actually matters day to day: the mistakes that make a fair test dishonest.

The phases

Why you'd randomize at all — the core idea, and why "before vs. after" is a trap.
How a real test is structured — control vs. variant, one metric, sample size, and what "significant" means.
How teams fool themselves — peeking early, metric fishing, and novelty effects.

Phase 1: Why you'd randomize at all →