The way that certain images, videos or concepts can suddenly spread like wildfire across the web, using email and social websites to propagate, is one of online culture’s most distinctive phenomena.

Now Spanish researchers claim to have found a way to accurately predict how quickly and widely new pieces of information, or “memes” as they are called, will spread. The ability to forecast this “viral” behaviour would be of great interest to sociologists and marketeers, among others.

The secret, they say, is to recognise the fact that people vary in how “infectious” they are when it comes to sharing content online. While some people pass on things they receive right away, others do so after some delay, or not at all.

Medical models

The viral spread of information online has conventionally been modelled using epidemiological tools developed to analyse the spread of biological viruses. One of the concepts borrowed is that of an infection’s R₀, or basic reproductive number, which describes how many other people someone with the virus can be expected to infect.

Knowing R₀ helps predict the likelihood and extent of real-life epidemics, such as H1N1 swine flu. But models that apply the idea to online information can only indicate whether an internet meme is likely to be successful or to die out quickly, says Esteban Moro at the Carlos III University of Madrid, Spain.
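
To see why R₀ on its own mainly separates memes that fizzle from memes that take off, here is a minimal Python sketch of a simple branching process. It is an illustration under stated assumptions, not the researchers’ model: each recipient is assumed to pass the item on to a Poisson-distributed number of new people with mean R₀, and all function names are hypothetical. In such a process, when R₀ is below 1 the average cascade reaches 1/(1 − R₀) people, counting the originator; above 1 it can keep growing, which is why the sketch includes a cap.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson-distributed integer using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def cascade_size(r0, rng, cap=100_000):
    """Total number of people reached, starting from a single sender.

    Assumption: every recipient independently forwards the item to a
    Poisson(r0) number of new people. The cap stops runaway cascades.
    """
    active = 1  # people who received the item in the latest generation
    total = 1   # everyone reached so far, including the originator
    while active > 0 and total < cap:
        newly_reached = sum(poisson(r0, rng) for _ in range(active))
        total += newly_reached
        active = newly_reached
    return total


def mean_cascade_size(r0, runs=20_000, seed=0):
    """Average cascade size over many simulated releases."""
    rng = random.Random(seed)
    return sum(cascade_size(r0, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    # Compare the simulated average with the 1 / (1 - r0) expectation.
    for r0 in (0.2, 0.5, 0.8):
        print(r0, round(mean_cascade_size(r0), 2), "theory:", round(1 / (1 - r0), 2))
```

Note that this sketch says nothing about how fast a cascade unfolds, only how far it is likely to go.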

Moro, working with José Luis Iribarren at IBM in Madrid, used IBM’s company email newsletter to show the importance of variations between people’s infectiousness in propagating memes online.

Email trail

They started a reward scheme that offered prize-draw tickets to people who recommended the newsletter by supplying other people’s email addresses, then tracked how widely and quickly the recommendations spread. After two months the newsletter had reached 31,000 people.

But while people took an average of 1.5 days to respond to a recommendation email, there was huge variation at the individual level: some users responded within minutes, others in months, says Moro.

Only by combining some expectation of that variation with R₀ is it possible to build a model that can predict the meme’s spread. The team use a small chunk of the initial data on the content’s spread to predict how many people it will reach in total, and how fast. “Our model can give predictions within 1 per cent error once secondary reproductive number and human activity are estimated,” Moro says.
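
The interplay between reach and timing can be sketched in Python. The code below is a hypothetical illustration, not the published model: it assumes Poisson-distributed forwarding with mean `r0`, and compares a fixed 1.5-day response delay with log-normally distributed delays that have the same 1.5-day mean. The log-normal shape and the spread parameter `sigma` are assumptions chosen for illustration, as are all function names. In this toy setting `r0` governs how many people are reached, while the delay distribution governs when they are reached.

```python
import math
import random
import statistics


def poisson(mean, rng):
    """Draw a Poisson-distributed integer using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def lognormal_delay(mean_days, sigma):
    """Return a sampler of log-normal delays whose mean is mean_days."""
    mu = math.log(mean_days) - sigma ** 2 / 2  # mean = exp(mu + sigma^2 / 2)
    return lambda rng: rng.lognormvariate(mu, sigma)


def spread_times(r0, draw_delay, rng, max_people=50_000):
    """Sorted times, in days, at which each person receives the item."""
    times = [0.0]    # the originator has the item at time zero
    pending = [0.0]  # receipt times of people who have not yet forwarded
    while pending and len(times) < max_people:
        received_at = pending.pop()
        for _ in range(poisson(r0, rng)):
            arrival = received_at + draw_delay(rng)  # delay before passing it on
            times.append(arrival)
            pending.append(arrival)
    return sorted(times)


if __name__ == "__main__":
    samplers = {
        "fixed 1.5 days": lambda rng: 1.5,
        "log-normal, mean 1.5 days": lognormal_delay(1.5, sigma=2.0),
    }
    for label, sampler in samplers.items():
        rng = random.Random(42)
        arrivals = []
        for _ in range(2_000):  # pool many independent releases
            arrivals.extend(t for t in spread_times(0.8, sampler, rng) if t > 0)
        print(label, "| median arrival:", round(statistics.median(arrivals), 2),
              "| latest arrival:", round(max(arrivals), 1))
```

Run as a script, this prints the median and latest arrival times under each delay assumption, so the two can be compared for the same `r0`.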

The model cannot predict whether a piece of content will go viral before it has been released, only its likely reach once it starts spreading. The researchers think their approach to modelling should apply to information spreading via social networking sites and other online services as well as email.

‘Remarkable result’

Statistician Claudio Castellano, at the “Sapienza” University of Rome, calls the match between prediction and real result “remarkable”. He adds that there is other evidence to back up the idea that people vary in online infectiousness.

For instance, David Liben-Nowell at Carleton College in Northfield, Minnesota, and colleague Jon Kleinberg at Cornell University last year traced an 11-year-old email chain letter to show the differences between the spread of real viruses and viral information.

Moro’s study agrees with his own results, says Liben-Nowell. “Many models of information propagation discount both the role of time and [differences between] people.” But there is more to discover, he says: for example, how people’s infectiousness may vary depending on the type of content they receive.

Journal references: Moro and Iribarren study – Physical Review Letters (DOI: 10.1103/PhysRevLett.103.038702)

Liben-Nowell and Kleinberg study – Proceedings of the National Academy of Sciences (DOI: 10.1073/pnas.0708471105)