[This article was kindly contributed to R-bloggers.]

News is starting to leak that the Large Hadron Collider may have accomplished its primary mission of confirming the existence of the hypothesised and heretofore elusive subatomic particle, the Higgs Boson. And sure, billions of euros' worth of state-of-the-art high-energy machinery and an army of experimental and theoretical physicists probably had something to do with the discovery. But did you know statistics played a part as well? Check out this explainer video from PHD Comics, below (an R chart even appears at the 00:27 mark):

The basic method the LHC uses to detect the Higgs Boson is to generate decay products from subatomic collisions, and to generate charts like the one below:

Depending on whether or not the Higgs Boson exists, this chart will look different. But the difference between what this chart looks like if the Higgs Boson exists and what it looks like if it doesn't is very, very small:

As the physicist interviewed in the video says, to resolve this difference “what you need is a HUGE amount of data”. And we're talking REALLY huge: experiments like this are run 40 million times a second, every day. (That's more than 40 terabytes of new data every day: now, that's Big Data!) Every day since the LHC was turned on, more evidence for the Higgs model has been accumulating, and it seems that enough has now accumulated for the researchers to be confident the Higgs Boson does indeed exist. Look for the formal confirmation in the next couple of days.
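To get a feel for why a tiny difference demands so much data, here is a minimal sketch of a simple counting experiment in Python. It is not the LHC's actual analysis: the function names and both rates are hypothetical, chosen only for illustration. It uses the common rule of thumb that the significance of an excess is roughly the expected signal count divided by the square root of the expected background count, so it grows only with the square root of the number of collisions.

```python
import math

# Hypothetical per-collision rates, for illustration only (not LHC values).
BACKGROUND_RATE = 1e-3  # chance a collision yields a background-like event
SIGNAL_RATE = 1e-6      # chance a collision yields a genuine signal event


def expected_significance(n_collisions, background_rate, signal_rate):
    """Approximate significance, in standard deviations, of a signal excess.

    Rule of thumb for a counting experiment:
    expected signal count / sqrt(expected background count).
    """
    expected_signal = n_collisions * signal_rate
    expected_background = n_collisions * background_rate
    return expected_signal / math.sqrt(expected_background)


def collisions_needed(target_sigma, background_rate, signal_rate):
    """Number of collisions at which the expected significance reaches target_sigma."""
    return math.ceil(target_sigma ** 2 * background_rate / signal_rate ** 2)


if __name__ == "__main__":
    # Each 100-fold increase in data buys only a 10-fold gain in significance.
    for n in (10 ** 6, 10 ** 8, 10 ** 10):
        sigma = expected_significance(n, BACKGROUND_RATE, SIGNAL_RATE)
        print(f"{n:>14,d} collisions: about {sigma:.2f} sigma")

    # Five standard deviations is the conventional discovery threshold
    # in particle physics.
    needed = collisions_needed(5, BACKGROUND_RATE, SIGNAL_RATE)
    print(f"Collisions needed for 5 sigma: about {needed:.2e}")
```

With these made-up rates, a million collisions leave the signal buried in the noise, and it takes tens of billions before the excess stands out. That square-root scaling is why the evidence has to accumulate day after day.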