How Wikipedia can spot a box office smash a month before it is released

Researchers analysed the number of views and edits on a film's Wikipedia page before it was released

Say their model can predict success with up to 90% accuracy

Wikipedia can be used to predict the box office takings of a blockbuster movie a month before it is released, researchers have claimed.

Taha Yasseri, a physicist at the Budapest University of Technology and Economics, created a mathematical model that takes into account data such as the number of readers and editors for the Wikipedia page.

He found that this activity correlates with takings on the film's opening weekend.



Researchers analysed films such as Iron Man 2 and found that the number of Wikipedia views and edits corresponded to their opening weekend success at the box office

First weekend box office revenue in the US against the value predicted 30 days before release.

For the biggest movies in the sample - such as Iron Man 2, Alice in Wonderland, Toy Story 3 and Inception - Wikipedia proved more than 90% accurate, although the researchers admit that their overall success rate was 77%.

Predictions for less successful movies, such as Never Let Me Go, Animal Kingdom and The Killer Inside Me, varied more widely from what actually happened.

'We show that the popularity of a movie could be predicted well in advance by measuring and analyzing the activity level of editors and viewers of the corresponding entry to the movie in Wikipedia, the well-known online encyclopedia,' say the team.

Yasseri and his colleagues, Márton Mestyán and János Kertész, built the model using data on 312 movies with Wikipedia pages, out of a total of 535 that were released in the US in 2010.

Overall, the predicted box office takings matched reality with an accuracy of around 77%.

The study was posted this week on the arXiv database.

'We were looking for the fingerprints of popularity of a movie,' said Yasseri.

The team analysed 312 films, including big hits such as Toy Story 3

The team reasoned that the Wikipedia entries of movies that were going to be popular would be more heavily edited and visited by more readers.

Yasseri added that the model could be used by studios to help predict the potential success of their movies.

