For those of you from other parts of the world, there has been a small sensation over the weekend here about Cadburys chocolate randomisation. One of their products was a large chocolate egg accompanied by eight miniature chocolate bars, chosen randomly from five varieties. Public opinion on the desirability of some of these varieties is more polarised that for others.

Stuff reports:

But one family found seven Cherry Ripes out of eight bars and most of those complaining to Cadbury say they found at least six Cherry Ripes out of eight. Cadbury claimed that it was just bad luck saying the chocolates are processed randomly and the Cherry Ripe overdose was not intentional.

Both Stuff and The Guardian got advice on the probabilities. They get different answers: Martin Hazelton says seven out of eight being the same (of any variety) is about 1 in 10,000 and the Guardian’s two advisers say there’s nearly a 1 in 100 chance of getting seven Cherry Ripes out of eight (which is obviously less likely than getting seven of eight the same).

With a hundred-fold difference in the estimates, I think a tie-breaker is in order. Also, I’m going to do this the modern way: by simulation rather than by being clever. It’s much more reliable.

I’m going to trust the Guardian on what the five flavours were (since it doesn’t actually matter, I think this is safe). I’ve put the code and results for 100,000 simulated packages up here. The number of packs with seven or more bars the same was 44 out of 100,000. There’s obviously some random uncertainty here, but a 95% confidence interval for the proportion goes from 3 in 10,000 to 6 in 10,000, and so excludes both of the published estimates . Since computing time is nearly free, and the previous run took only 13 seconds, I tried it on a million simulated packs just to be sure, and also separated out ‘seven or more of anything’ from ‘seven or more Cherry Ripes’.

Out of a million simulated packs, 442 had seven or more of some type of bar, and 83 had seven or more Cherry Ripes. The probability of seven or more of something is between 4 and 5 out of 10,000 and the probability of seven or more Cherry Ripes is between 0.6 and 1 out of 10,000. It looks as though Professor Hazelton’s estimate of ‘a little less than one in 10,000‘ is correct for Cherry Ripes specifically. The Guardian figures seem clearly wrong. The Guardian is also wrong about the probability of getting at least one of each type, which this code shows to be about 30%, not the 7% they give.

I said I wasn’t going to do this by maths, but now I know the answer I’m going to go out on a limb here and guess that Martin Hazelton’s probability was, in maths terms, P(Binom(8, o.2)≥7), which is the answer I would have given for Cherry Ripes specifically. With Jack and Andrew in the Guardian I think the issue is that they have counted all 495 possible aggregate outcomes as being equally likely, when it’s actually the 32768 390625 underlying ordered outcomes that are equally likely.

The other aspect of this computation is the alternative hypothesis. It makes no sense that Cadbury would just load up the bags with Cherry Ripes and pretend they hadn’t — especially as the Guardian reports other sorts of complaints as well. We need to ask not just whether the reports would be surprising if the bags were randomised, but whether there’s another explanation that fits the data better.

The Guardian story hints at a possibility: clumping together of similar chocolates. It also would be conceivable that the randomisation wasn’t quite even — that, say, Cherry Ripes were 25% instead of the intended 20%. It’s easy to modify the code for unequal probabilities. Having one chocolate type at 25% doubles the number of seven-or-more coincidences, and more than half of them are now with Cherry Ripes. But that’s quite a big imbalance to go unnoticed at Cadburys, and it doesn’t push the probability a lot.

So, I’d say bad luck is a feasible explanation, but it could easily have been aggravated by imperfect randomisation at Cadburys.

Many lessons could be drawn from this story: that simulation is a good way to do slightly complicated probability questions; that people see departures from randomness far too easily; that Cadburys should have done systematic sampling rather than random sampling; maybe even that innovative maths teachers may have gone too far in rejecting contrived ball-out-of-urn problems as having no Real World use.