When Public Relations Departments Do Cybercrime Research

A recent article on ProPublica dissected two commonly quoted figures about cybersecurity: $1 Trillion in losses due to cybercrime itself, and $388 billion related to consumer monetary losses and time spent repairing the damage caused by cybercrime. Both figures, according to the article, are hyperbole to say the least.

Let's start with the first figure. Authors Peter Maass and Megha Rajagopalan of ProPublica say that NSA Director General Keith Alexander has recently been using the $1 Trillion figure in speeches, as have Senators Lieberman and Collins, whose Cybersecurity Act of 2012 failed to pass in the Senate this week. The most famous use of the figure, perhaps, was President Obama's 2009 Cybersecurity speech.

The $1 Trillion figure is attributed to anti-virus vendor McAfee, which first used it in its 2009 "Unsecured Economies: Protecting Vital Information" report, and again in its 2011 "Underground Economies: Intellectual Capital and Sensitive Corporate Data Now the Latest Cybercrime Currency" report. The original 2009 report was authored with researchers from Purdue University who insist that they didn't generate that number. Indeed, ProPublica couldn't find any of the original named contributors to the report who'd own up to that figure, until they reached Sal Viveros, a McAfee public relations official. "McAfee extrapolated the $1 trillion … based on the average data loss per company, multiplied by the number of similar companies in the countries we studied," Viveros told ProPublica.

Such an extrapolation would not stand up to statistical scrutiny.

As for the $388 billion in losses, which I previously quoted, Symantec breaks the figure down to $274 billion lost in time spent repairing the damage caused by cybercrime, and $114 billion in money stolen and money spent on resolving cybercrime.

According to ProPublica, "The report was not actually researched by Norton employees; it was outsourced to a market research firm, StrategyOne, which is owned by the public relations giant Edelman."

So McAfee is stretching the truth a bit, while it appears that Symantec is simply making stuff up.

The problem with both of these figures -- $1 Trillion and $388 billion -- is that, as Microsoft researchers pointed out earlier this year, they are studded with outliers. Say one person reports having lost one million dollars due to an Internet phishing attack ... but the vast majority of losses are somewhere around a few hundred dollars each. That one million (the outlier) is going to skew the results higher than they would be if that one result were simply taken out (a common survey practice).
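To make the outlier effect concrete, here is a minimal sketch using made-up numbers: 999 hypothetical respondents who each report a $200 loss and one who reports $1,000,000. None of these values come from the McAfee or Symantec reports; they only illustrate how a single response can dominate an average.

```python
from statistics import mean, median

# Hypothetical survey responses (illustrative only, not from any report):
# 999 people each report a $200 loss, and one person reports $1,000,000.
losses = [200] * 999 + [1_000_000]

mean_with_outlier = mean(losses)          # average including the big claim
mean_without_outlier = mean(losses[:-1])  # average with that one claim removed
median_loss = median(losses)              # the middle value ignores the extreme

print(f"Mean with outlier:    ${mean_with_outlier:,.2f}")
print(f"Mean without outlier: ${mean_without_outlier:,.2f}")
print(f"Median:               ${median_loss:,.2f}")
```

In this toy sample, one response out of a thousand raises the average loss roughly sixfold, while the median does not move at all.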

Better examples come from their report, fittingly titled "Sex, Lies, and Cybercrime." In one example, they note that a single individual who claims $50,000 in losses, in an N = 1000 person survey, is enough to extrapolate a $10 billion loss over the population. In another, one unverified claim of $7,500 in phishing losses translates into $1.5 billion over the population.
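The arithmetic behind those two examples can be sketched as follows. The population size of 200 million is an assumption inferred from the quoted figures ($10 billion divided by $50,000 implies each of the 1,000 respondents stands in for 200,000 people); it is not stated in this column. The function name is hypothetical.

```python
def extrapolated_loss(claimed_loss, sample_size, population):
    """Scale one respondent's claimed loss up to the whole population.

    Each respondent is treated as representing population / sample_size people.
    """
    return claimed_loss * population / sample_size

# Assumption: a population of 200 million, inferred from the quoted figures.
POPULATION = 200_000_000
SAMPLE_SIZE = 1_000

big_claim = extrapolated_loss(50_000, SAMPLE_SIZE, POPULATION)
phishing_claim = extrapolated_loss(7_500, SAMPLE_SIZE, POPULATION)

print(f"$50,000 claim -> ${big_claim:,.0f} over the population")
print(f"$7,500 claim  -> ${phishing_claim:,.0f} over the population")
```

Under that assumption, every dollar a single respondent claims becomes $200,000 in the headline estimate, which is why one unverified answer can account for billions.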

The problem is, taking out the outliers further reduces the already limited number of respondents willing to report such losses. Yet reports built on only a handful of meaningful responses still see the light of day and, increasingly, get quoted as gospel. Researchers Dinei Florencio and Cormac Herley write: “It is ironic then that our cyber-crime survey estimates rely almost exclusively on unverified user input. A practice that is regarded as unacceptable in writing code is ubiquitous in forming the estimates that drive policy.”

The Microsoft researchers concluded: "Are we really producing cyber-crime estimates where 75% of the estimate comes from the unverified self-reported answers of one or two people? Unfortunately, it appears so. Can any faith whatever be placed in the surveys we have? No, it appears not."

Something else may be going on here. At both DEF CON and even at Black Hat last week, the Feds seemed to be signaling an S.O.S., asking private enterprises and even individuals to help the government ward off future cybercrime attacks. So perhaps using these scary numbers, and attributing them to household-name security companies, is strategic.

If so, then wouldn't it be better if those same household-name security companies took care to get their figures right in the first place?

Editor's Note: Symantec provided the following statement to SecurityWeek in response to this column:

We are by no means “making stuff up.” The survey data, on which the Norton Cybercrime Report is based, comes from direct consumer response and self-reported data, as clearly stated in all of our communication about this survey and also mentioned in your previous blog post.

We stand behind the Norton Cybercrime Report and its methodology. Self-reported data is a standard research method and the data is normalized by sampling across a large number of adult consumers, nationally representative in each of the 24 countries where the survey took place. Governments, businesses and not-for-profits use a similar approach of doing household studies using sampled populations when making estimates of everything from expenditure to disposable income to online habits.

Correction: Story was updated to correct instances where $388 billion incorrectly read "million," and to add a breakdown of the $388 billion figure provided by Symantec.

Related: Sex, Lies and Cybercrime Surveys - Exaggerations Cloud Reality