Bandit vs A/B Test Simulator

Watch a multi-armed bandit beat a fixed even split, live.

An A/B test splits traffic evenly the entire run, even after a clear loser emerges. A multi-armed bandit shifts traffic toward winners as evidence accumulates — exploring less the more confident it gets. Thompson sampling does this by drawing from each variant's Beta posterior, naturally exploring uncertain options and exploiting confident ones. Set the “true” click-through rates below and watch both strategies play out on the same traffic.
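To see why a Beta posterior explores uncertain variants more, here is a minimal sketch comparing how wide the posterior is after a little data versus a lot. The click counts are made-up illustration values, and `beta_std` is a hypothetical helper name, not part of the simulator.

```python
import math

def beta_std(alpha, beta):
    """Standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return math.sqrt(alpha * beta / (total ** 2 * (total + 1)))

# Illustrative numbers only: both variants show a 5% observed CTR,
# but one has 100 impressions and the other 1,000.
# Posterior = Beta(1 + clicks, 1 + non-clicks), starting from Beta(1, 1).
few = beta_std(1 + 5, 1 + 95)      # 5 clicks in 100 impressions
many = beta_std(1 + 50, 1 + 950)   # 50 clicks in 1,000 impressions

print(f"posterior std after 100 impressions:   {few:.4f}")
print(f"posterior std after 1,000 impressions: {many:.4f}")
```

The wider posterior produces more extreme samples, so a lightly tested variant still wins the draw now and then. As impressions pile up, the spread shrinks and the draws settle near the observed CTR.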

Variants

The bandit and A/B test don't know these — only you do.

Inputs: true CTR (%) for each variant (D is marked Best), total impressions, and revenue per click ($).

A/B test

10,000 impressions

Equal 25/25/25/25 split, all the way through

Variant      Traffic   Impr.   Clicks   CTR
Headline A   25.0%     2,500       47   1.88%
Headline B   25.0%     2,500       73   2.92%
Headline C   25.0%     2,500       90   3.60%
Headline D   25.0%     2,500      126   5.04%

Total clicks: 336
Overall CTR: 3.36%
$ wasted: $164

Thompson sampling bandit

10,000 impressions

Shifts traffic toward variants the data favors

Variant      Traffic   Impr.   Clicks   CTR
Headline A    3.4%       339        8   2.36%
Headline B    2.8%       276        5   1.81%
Headline C   29.9%     2,990      135   4.52%
Headline D   63.9%     6,395      323   5.05%

Total clicks: 471
Overall CTR: 4.71%
$ wasted: $29

[Chart: A/B test allocation over time. % of impressions sent to each variant, over the run. Series: A · Headline A, B · Headline B, C · Headline C, D · Headline D.]

[Chart: Bandit allocation over time. % of impressions sent to each variant, over the run. Series: A · Headline A, B · Headline B, C · Headline C, D · Headline D.]

[Chart: Cumulative budget wasted ($). Series: A/B test, Bandit. Money “left on the table” vs always serving the optimal variant. Lower is better.]

Verdict

The bandit saved you $135 (82%) compared to the A/B test.

It did that by routing 64% of impressions to D · Headline D, the variant with the highest true CTR (5.0%). The A/B test sent it only 25%, by design. As a result, the A/B test wasted about 5.7× as much as the bandit ($164 vs. $29).
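The verdict figures can be checked by hand from the two result panels. The sketch below redoes that arithmetic. It assumes a revenue per click of $1, which is not shown in this run's inputs but is consistent with the reported totals; the variable names are illustrative.

```python
impressions = 10_000
best_true_ctr = 0.05          # D's true CTR, from the verdict text
revenue_per_click = 1.0       # assumption: not shown above, but matches the totals

ab_clicks, bandit_clicks = 336, 471   # total clicks from each panel

# Dollars wasted = (clicks the best variant would earn on average - clicks earned) x revenue
ab_wasted = round((best_true_ctr * impressions - ab_clicks) * revenue_per_click)
bandit_wasted = round((best_true_ctr * impressions - bandit_clicks) * revenue_per_click)

saved = ab_wasted - bandit_wasted
saved_pct = saved / ab_wasted * 100
ratio = ab_wasted / bandit_wasted

print(f"A/B wasted ${ab_wasted}, bandit wasted ${bandit_wasted}")
print(f"saved ${saved} ({saved_pct:.0f}%), A/B wasted {ratio:.1f}x as much")
```

With the numbers above, this prints $164 and $29 wasted, a saving of $135 (82%), and a 5.7x ratio.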

This is exactly what TrafficLoopback automates on your owned traffic — no manual significance tests, no fixed splits, just continuous reallocation toward whatever is actually working.

How this works

Each impression is a Bernoulli trial: a click happens with probability equal to that variant's true CTR. Both strategies see the same kind of noisy world.
The A/B test gives every variant the same share of traffic for the whole run. Total clicks = sum of Bernoulli draws.
The bandit starts each variant at a Beta(1, 1) prior. Before every impression, it draws a sample from each variant's posterior and serves the variant with the highest sample. Then it updates only the variant it served: alpha += 1 on a click, beta += 1 on no click.
Regret = (best true CTR × impressions) − clicks actually earned. Multiply by revenue per click and you get the dollars left on the table by serving losing variants.
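The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulator's actual source. The function names, the round-robin way of splitting A/B traffic, the seed, and the true CTRs for A, B, and C are hypothetical. Only D's 5% true CTR appears on this page.

```python
import random

def simulate_ab(true_ctrs, impressions, rng):
    """Equal split: variants take turns (round-robin) for the whole run."""
    k = len(true_ctrs)
    shown, clicks = [0] * k, [0] * k
    for i in range(impressions):
        arm = i % k
        shown[arm] += 1
        clicks[arm] += int(rng.random() < true_ctrs[arm])  # Bernoulli trial
    return shown, clicks

def simulate_thompson(true_ctrs, impressions, rng):
    """Thompson sampling with a Beta(1, 1) prior on every variant."""
    k = len(true_ctrs)
    alpha, beta = [1] * k, [1] * k
    shown, clicks = [0] * k, [0] * k
    for _ in range(impressions):
        # Draw one sample per variant from its current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = max(range(k), key=samples.__getitem__)   # serve the highest draw
        click = int(rng.random() < true_ctrs[arm])     # Bernoulli trial
        shown[arm] += 1
        clicks[arm] += click
        alpha[arm] += click        # alpha += click
        beta[arm] += 1 - click     # beta += no-click
    return shown, clicks

def wasted_dollars(true_ctrs, impressions, total_clicks, revenue_per_click):
    """Regret in dollars: (best true CTR x impressions - clicks earned) x revenue."""
    return (max(true_ctrs) * impressions - total_clicks) * revenue_per_click

# Hypothetical inputs: only D's 5% is taken from the page.
TRUE_CTRS = [0.02, 0.03, 0.04, 0.05]
IMPRESSIONS = 10_000
REVENUE_PER_CLICK = 1.0

rng = random.Random(42)  # arbitrary seed so the run is repeatable
strategies = (("A/B test", simulate_ab), ("Bandit", simulate_thompson))
for name, run in strategies:
    shown, clicks = run(TRUE_CTRS, IMPRESSIONS, rng)
    waste = wasted_dollars(TRUE_CTRS, IMPRESSIONS, sum(clicks), REVENUE_PER_CLICK)
    print(f"{name}: impressions={shown} clicks={sum(clicks)} wasted=${waste:.0f}")
```

Each run is noisy, so individual results will differ from the panels above. Regret here compares realized clicks against the best variant's expected clicks, so a lucky run can even come out slightly negative.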

Skip the math. Let a bandit run your tests for you.

TrafficLoopback runs Thompson-sampling bandits on your owned traffic, so you find winning ad creatives without manual significance tests. Free during beta.

