Note: This 2014 article is now outdated. Please read a more current assessment of OVV’s estimates, together with our own Caracas Chronicles estimate of the violent death rate, here.

And a popular hobby it is…

As Interior Minister Miguel Rodríguez Torres and the Observatorio Venezolano de Violencia battle it out over the 2013 murder figures, most Venezuelans shrug their shoulders and believe who they want to believe. Government supporters instinctively trust the minister, while the opposition takes it for granted that OVV’s much higher estimate must be right.

Oddly, in this hyperpolarized environment, the quality of that OVV number hasn’t gotten much scrutiny—and the OVV figure is not what it seems. It’s not a body count based on leaked government data, nor is it an estimate constructed from a proprietary survey. It’s a forecast based on past trends – a slightly more sophisticated version of the XKCD method.

In fact, the methodology section of OVV’s latest press release raises more questions than it answers. The forecasting techniques they mention require data both on past forecasts – i.e., 2010’s guess for 2011 – and on actual realized past values – i.e., the actual number of violent deaths in 2011. But since, by OVV’s own account, they don’t have access to reliable counts for at least the past five years, it’s not clear what they’re using as model inputs. There’s an oblique reference to “partial data from diverse regional and national sources,” but it’s not clear what those data are or how they’re used.
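To see why the missing inputs matter, here is a minimal sketch of one familiar technique that needs both past forecasts and realized values: simple exponential smoothing. That OVV uses this particular method is my assumption, not something the press release confirms, and the function name, the smoothing weight `alpha` and the counts are all invented for illustration.

```python
def exponential_smoothing_forecasts(actuals, alpha, initial_forecast):
    """Return one-step-ahead forecasts [F_1, ..., F_{n+1}] for actuals A_1..A_n.

    Each new forecast blends last year's realized value with last year's
    forecast: F_{t+1} = alpha * A_t + (1 - alpha) * F_t.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    forecasts = [initial_forecast]
    for actual in actuals:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical annual counts, invented for illustration only (not real data).
actuals = [100.0, 110.0, 120.0]
forecasts = exponential_smoothing_forecasts(actuals, alpha=0.5, initial_forecast=100.0)
print(forecasts)
```

Every step consumes a realized value. Take away the `actuals` list and the recursion has nothing to update on, which is exactly the gap in OVV's account.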

I reached out to Roberto Briceño León for clarification [disclosure: RBL is a friend and I had him read a draft of this post before publishing], hoping to be told, “No, chama, you misunderstood; our number is based on data from such-and-such source.” Nope. Instead, he said that my methodological criticisms were valid, adding, “Asi que nuestra metodología no es perfecta ni acorde a todos los cánones, pero qué podemos hacer, no hay otro modo de tener información y en condiciones de obscuridad es la poca luz que tenemos.” In other words: without reliable government statistics, what do you want from us except a simple forecast?

Well, let’s see: for starters, I’d like OVV to publish rather than hide the accuracy (or lack thereof) of their guesstimate. For instance, they tell us that “el rango de las afirmaciones podemos hacerla con un 95% de confianza,” but this doesn’t really make any sense, because they’re not publishing a rango, they’re publishing a number! That point estimate without a confidence interval is close to meaningless. Are they 95% sure that the interval 75/100,000 to 83/100,000 contains the true homicide rate? Or are they 95% sure that it’s between 20/100,000 and 138/100,000? For all we know, the government’s figure – 39/100,000 – is within OVV’s range.
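For the curious, reporting a range alongside a forecast is not hard. The sketch below fits a straight-line trend by ordinary least squares and returns a 95% prediction interval for the next year. It assumes a linear trend with independent, roughly normal errors, which is my simplification and not OVV's stated model. The rates are invented, and the function name and the small table of Student-t critical values are mine.

```python
import math

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
T_CRIT_975 = {3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447,
              7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def trend_forecast_with_interval(xs, ys, target_x):
    """Fit y = a + b*x by least squares; return (point, lower, upper) at target_x."""
    n = len(xs)
    df = n - 2  # two parameters estimated: intercept and slope
    if df not in T_CRIT_975:
        raise ValueError("this sketch only covers 5 to 12 observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual standard error of the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / df)
    point = intercept + slope * target_x
    # Standard error for predicting a single new observation at target_x.
    se_pred = s * math.sqrt(1 + 1 / n + (target_x - x_mean) ** 2 / sxx)
    half_width = T_CRIT_975[df] * se_pred
    return point, point - half_width, point + half_width


# Hypothetical rates per 100,000 for ten consecutive years (invented, not real data).
year_index = list(range(10))
rates = [40.0, 44.0, 43.0, 49.0, 52.0, 50.0, 57.0, 60.0, 58.0, 65.0]
point, lower, upper = trend_forecast_with_interval(year_index, rates, target_x=10)
print(f"forecast {point:.1f}, 95% interval {lower:.1f} to {upper:.1f}")
```

The noisier the past series, the wider the interval. If the inputs are themselves guesses, the honest range is wider still, and that range is what a reader needs in order to judge whether two competing figures actually disagree.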

Then they compound the problem by reporting what they’ve already told us is an estimate with extreme numerical precision – 24,763 deaths, rather than 24,762 or 24,764 – lending the figure an artificial air of exactitude. Not amusing.

And there’s a further methodological pasticho that has to do with the inconsistent use of “homicides” and “violent deaths,” which look like synonyms but definitely aren’t. (For instance, if you’re killed resisting arrest, that doesn’t count as a murder for Rodríguez Torres’s purposes.) OVV used to report homicides, but their 2013 press release actually refers to violent deaths. So we end up with a case of apples and oranges: OVV’s 2013 (and 2012) estimates aren’t directly comparable to OVV estimates from previous years…or to Rodríguez Torres’s number.

These flubs are troubling because OVV insistently bills itself as an academic institution, which gives their estimates a scientific aura. That press release refers to “los investigadores de las siete universidades nacionales que integramos el Observatorio Venezolano de Violencia,” and Briceño-León is a respected and prolific professional sociologist. But claims to scientific status depend crucially on publishing detailed descriptions of methodology, allowing other researchers to scrutinize and replicate results. Why doesn’t OVV do this?

And, if OVV is wrong, what’s the real story?

Surely, we can’t just trust the minister’s number: with even less information about how Rodríguez Torres’s people arrived at their figure (insert obligatory condemnation of CICPC censorship), that would be foolhardy. So is there any reason – other than Rodríguez Torres’s say-so – to believe that the worst of the violence nightmare might actually be behind us?

In fact, I see three reasons to suspect that the answer might be yes … and none of them have to do with Plan Patria Segura.

First, little noted in the Guerra de Cifras, there actually is a third source of info on all this: the Ministry of Health (MPPS), which compiles and publishes birth and death records. And, perhaps surprisingly, the MPPS data show the violent death rate falling in 2009, 2010, and 2011 (the last available year; 2012 and 2013 haven’t been posted yet).

Of course, MPPS officials might have manipulated the data for political reasons – but if they did, they did so carefully, without leaving obvious tracks in the underlying microdata (which I have worked with for research purposes). Naturally, there’s a lot of room for error – and potential bias – in the way MPPS codes the death certificates that give rise to this data. Still, much of the administrative process is separate from the process that produces the CICPC data, which makes it better than nothing as a third source. Moreover, there isn’t an obvious motive to fudge it, since this data is so far off the political radar screen.

But let’s assume you just refuse to take any number the government publishes seriously on principle. Then are we stuck with the XKCD method?

Not at all. Rather than forecast the homicide rate based exclusively on the past trend – which, for OVV, is itself a series of rough guesses – you might try to estimate the homicide rate based on hypothesized covariates. Roberto Briceño-León, for example, has made quite clear that he views impunidad as a key driver of the violent crime trend; he might therefore try to look at data on policing or apprehension rates.

Or take Quico’s hypothesis, which is that the volume of cocaine trafficked through Venezuela drives the homicide rate—not, as in Mexico, because cartels fight for the international trafficking business (Venezuela has only one big cartel), but because some of the cocaine makes its way to the domestic market, where street gangs fight for the local retail trade. If you believe this story, you might note that trafficking has declined as demand in the U.S. falls, and that could account for lower violence in Venezuela.

Other research links homicide rates to childhood lead exposure, pointing to evidence that the decline in leaded gasoline use in the United States in the 1970s produced the decline in U.S. homicide rates in the 1990s.

This is yet another reason that Venezuela’s homicide rate might be declining: the childhood lead exposure of Venezuela’s malandreable-age cohort rose sharply through the 1990s (as crime increased) and then recently began to drop.

My point isn’t that either lead poisoning or cocaine trafficking is definitely responsible for the violence wave; these assertions would require serious quantitative evidence. My point is that there are lots of data researchers consider when trying to explain trends in homicide rates. Longstanding trends can turn, sometimes for reasons that have nothing to do with law enforcement.

But no new trends could show up in an OVV estimate that limits itself to extrapolating data from years when murder rates were rising. Actually, it’s not even clear why they needed to wait until December to publish their 2013 number: if all the data that went into their 2013 estimate was data they already had a year ago, why not publish the 2013 murder tally back then? For that matter, why not tell us right away how many homicides there will have been in 2014?
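The point about timing can be made concrete with a minimal sketch, assuming a plain least-squares line through equally spaced annual values (my simplification, not OVV's documented procedure; the function name and the numbers are invented). Nothing from the forecast year enters the calculation, so the same few lines will happily produce next year's figure, and the year after that.

```python
def linear_extrapolation(values, steps_ahead):
    """Fit a least-squares line to equally spaced values and project it forward."""
    n = len(values)
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sxx
    intercept = y_mean - slope * x_mean
    # Project the fitted line 1..steps_ahead periods past the last observation.
    return [intercept + slope * (n - 1 + h) for h in range(1, steps_ahead + 1)]


# Hypothetical rising series, invented for illustration only (not real data).
rising = [10.0, 12.0, 14.0, 16.0]
projections = linear_extrapolation(rising, steps_ahead=3)
print(projections)
```

Feed it a rising series and every projection is higher than the last. A turn in the real trend cannot show up until it is already in the inputs.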

(A note on sources is after the jump)

So now that I’ve tugged OVV’s ear for methodological opacity, I’d better tell you where I got all my data: