ABtesting.tools

How long should you run your A/B test?

Ending a test too early is one of the most common mistakes in experimentation. This guide explains how to determine the right duration and why patience pays off.

Duration depends on sample size and traffic

Test duration is fundamentally a function of two things: how many visitors you need (sample size) and how many visitors you get per day. The Duration Calculator computes this for you, but understanding the inputs helps you plan better.

The formula is straightforward: days = required sample size per variant ÷ daily visitors per variant. But there are important nuances beyond this simple division.
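As a rough sketch of that division, assuming the sample size is stated per variant and traffic is split evenly between variants (the function and parameter names here are illustrative and are not taken from the Duration Calculator):

```python
import math


def raw_test_days(sample_size_per_variant: int,
                  daily_visitors: int,
                  num_variants: int = 2) -> int:
    """Days until every variant has collected its required sample.

    Assumes an even traffic split across variants (an assumption of
    this sketch, not a rule from the article).
    """
    if daily_visitors <= 0 or num_variants < 2:
        raise ValueError("need positive traffic and at least two variants")
    daily_per_variant = daily_visitors / num_variants
    # Round up: a partial day still has to be run in full.
    return math.ceil(sample_size_per_variant / daily_per_variant)


# 20,000 visitors needed per variant, 3,000 visitors/day over 2 variants
print(raw_test_days(20_000, 3_000))  # 14
```

The result is only the raw day count; the weekly-cycle rule described next still applies on top of it.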

Always run for complete weekly cycles

User behavior varies dramatically throughout the week. Monday shoppers behave differently from Saturday browsers. B2B traffic drops on weekends. Promotional emails spike on specific days.

If your test runs for 10 days, it captures one full week plus three extra days. Those three days are overrepresented in your data, biasing results. The solution is simple: always round up to complete weeks (7, 14, 21, 28 days, etc.).

This ensures each day of the week is equally represented, eliminating day-of-week confounds from your results.
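The rounding rule can be written in a couple of lines. This is a minimal sketch with a hypothetical helper name:

```python
import math


def round_up_to_full_weeks(days: int) -> int:
    """Round a day count up to the next multiple of 7."""
    if days <= 0:
        raise ValueError("days must be positive")
    return math.ceil(days / 7) * 7


for d in (3, 10, 14, 15):
    print(d, "->", round_up_to_full_weeks(d))
# 3 -> 7
# 10 -> 14
# 14 -> 14
# 15 -> 21
```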

What affects test duration?

  • Daily traffic volume: more visitors means faster data collection. If you only get 100 visitors per day, even a simple test may take weeks.
  • Baseline conversion rate: lower baselines need more data. A 0.5% conversion rate test takes much longer than a 15% conversion rate test.
  • Minimum detectable effect: detecting smaller changes takes far longer, because the required sample size grows roughly with the inverse square of the MDE. A 2% relative MDE needs roughly 25x more data than a 10% MDE.
  • Number of variants: each additional variant requires its own share of traffic. A 4-variant test splits traffic four ways instead of two, which alone doubles the duration, and correcting for multiple comparisons brings it to roughly 3x that of an A/B test.
  • Traffic allocation: if only 50% of visitors enter the experiment, duration doubles. Account for any holdouts or targeting filters.
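To see how these factors interact, here is a hedged sketch that uses the standard normal-approximation sample size formula for comparing two proportions. The defaults (two-sided alpha of 0.05, 80% power), the even traffic split, and all function names are assumptions of this example; the Duration Calculator may use a different method, and no multiple-comparison correction is applied here.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline: float, relative_mde: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per variant for a two-proportion test."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)           # rate if the effect is real
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def estimated_days(baseline: float, relative_mde: float, daily_visitors: int,
                   num_variants: int = 2, allocation: float = 1.0) -> int:
    """Raw days needed, before rounding up to full weeks."""
    n = sample_size_per_variant(baseline, relative_mde)
    # Only `allocation` of traffic enters the test, split evenly by variant.
    daily_per_variant = daily_visitors * allocation / num_variants
    return math.ceil(n / daily_per_variant)


small = sample_size_per_variant(0.05, 0.02)
large = sample_size_per_variant(0.05, 0.10)
print("2% MDE vs 10% MDE sample size ratio:", round(small / large, 1))
print("100% allocation:", estimated_days(0.05, 0.10, 2_000), "days")
print("50% allocation:", estimated_days(0.05, 0.10, 2_000, allocation=0.5), "days")
```

With a 5% baseline the ratio comes out close to the 25x figure quoted above, and halving the allocation roughly doubles the day count.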

Common duration mistakes

  • Stopping at significance: checking daily and stopping the moment p < 0.05 dramatically inflates false positives. Commit to a fixed duration before starting, or use sequential testing.
  • Running too short: a test that runs for 3 days captures less than half a weekly cycle. Even if it reaches statistical significance, the results may not generalize to a full week of traffic.
  • Running too long: tests that run for months accumulate external confounds such as seasonality shifts, product changes, and marketing campaigns. Keep tests under 4–6 weeks when possible.
  • Ignoring holidays and events: Black Friday traffic is not representative of normal behavior. Avoid starting or ending tests around major events unless you are specifically testing for that context.
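The first mistake is easy to demonstrate with a small simulation. The sketch below runs A/A tests (both variants share the same true conversion rate, so every "significant" result is a false positive) and compares stopping at the first daily p < 0.05 against a single check on the final day. The traffic numbers, the pooled z-test, and the function names are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def z_test_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation observed yet, so nothing to detect
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa_tests(num_tests: int = 300, days: int = 14,
                      daily_per_variant: int = 200, rate: float = 0.10,
                      alpha: float = 0.05, seed: int = 1):
    """Return (false positive rate with daily peeking, with one final check)."""
    rng = random.Random(seed)
    peeking_hits = fixed_hits = 0
    for _ in range(num_tests):
        conv_a = conv_b = n = 0
        significant_on_any_day = False
        p = 1.0
        for _day in range(days):
            conv_a += sum(rng.random() < rate for _ in range(daily_per_variant))
            conv_b += sum(rng.random() < rate for _ in range(daily_per_variant))
            n += daily_per_variant
            p = z_test_p_value(conv_a, n, conv_b, n)
            if p < alpha:
                significant_on_any_day = True  # a peeker would stop here
        peeking_hits += significant_on_any_day
        fixed_hits += p < alpha  # p now holds the final-day value only
    return peeking_hits / num_tests, fixed_hits / num_tests


peeking_rate, fixed_rate = simulate_aa_tests()
print(f"false positives with daily peeking: {peeking_rate:.1%}")
print(f"false positives with a fixed end date: {fixed_rate:.1%}")
```

The fixed-duration rate should land near the nominal 5%, while the peeking rate is noticeably higher because every extra look is another chance to cross the threshold.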

Practical recommendations

  • Minimum duration: 2 full weeks (14 days) to capture two complete weekly cycles.
  • Maximum recommended: 4–6 weeks to avoid external confounds.
  • If the calculator says the test needs more than 6 weeks, consider increasing the MDE or focusing on higher-traffic pages.
  • Always pre-register your end date before launching. This prevents data-peeking temptation.
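These recommendations can be folded into a small planning helper. This is a sketch only: the names are hypothetical, 42 days is used as the upper end of the 4–6 week guideline, and the stop date is simply the start date plus the planned number of days.

```python
import math
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

MIN_DAYS = 14  # two full weekly cycles
MAX_DAYS = 42  # six weeks, the upper end of the recommended range


@dataclass
class DurationPlan:
    days: int
    end_date: date
    warning: Optional[str]


def plan_duration(raw_days: int, start: date) -> DurationPlan:
    """Turn a raw day estimate into a pre-registered plan."""
    # Round up to full weeks, then enforce the two-week minimum.
    days = max(MIN_DAYS, math.ceil(raw_days / 7) * 7)
    warning = None
    if days > MAX_DAYS:
        warning = ("Longer than 6 weeks: consider a larger MDE "
                   "or a higher-traffic page.")
    return DurationPlan(days, start + timedelta(days=days), warning)


plan = plan_duration(10, date(2024, 1, 1))
print(plan.days, plan.end_date, plan.warning)  # 14 2024-01-15 None
```

Writing the end date down before launch, for example in the experiment brief, is what makes it a pre-registration rather than a suggestion.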

Calculate your test duration

Use the Duration Calculator to get an exact estimate based on your traffic, baseline rate, and desired sensitivity. Pair it with the Sample Size Calculator to understand the relationship between the two.