2 — The Pentagon’s Iraq mortality pseudoscientist

In his paper published in Defence and Peace Economics in 2010, Prof. Spagat reaches the startling verdict that the 2006 Lancet survey “cannot be considered a reliable or valuable contribution to knowledge.” Yet close analysis reveals the same types of analytical sleights-of-hand he has perpetrated for the Colombia conflict.

In what follows, I interrogate the specific arguments of Prof. Spagat’s landmark critique of the 2006 Lancet survey, to explore whether they really support that conclusion.

AAPOR

Spagat’s first line of attack was the observation that the Lancet study breached several rules in the American Association for Public Opinion Research (AAPOR) Code of Professional Ethics & Practices. He followed this up with scrutiny of the questionnaires used and the overall study design. This is perhaps the most compelling element of Spagat’s critique, and contains valid points. However, their implications are overstated.

As the Washington DC-based doctors’ group Physicians for Social Responsibility (PSR) pointed out in its report last month, “the Lancet authors are not even members of the association,” and so were under no ethical obligation to adhere to its rules.

Disclosure

In particular, Spagat cited AAPOR to claim that the study’s lead author Prof. Gilbert Burnham has unreasonably refused to make either questionnaires or the data for the study available for inspection.

Yet Spagat ignored the fact that, according to the New Scientist, “Burnham has sent his data and methods to other researchers, who found it sufficient.” In fact, Burnham, New Scientist reported, had been told by his superiors at the Johns Hopkins Bloomberg School “not to supply AAPOR with the requested additional material since neither Burnham nor the Bloomberg School are AAPOR members, and therefore AAPOR had no right to play judge in this case.”

One main reason all the data has not been made available to others is that, as Spagat points out, in some cases the actual questionnaires used contained identifying markers. Their release, as the PSR report also notes, would have violated confidentiality. These identifying markers should not have occurred as a matter of ethical procedure, but had been missed by the lead researchers due to language issues. Bloomberg School censured Burnham for this error, but noted that no one had been harmed precisely because Burnham had ensured that the data itself was kept confidential and not simply handed out to the wider scientific community.

Spagat’s insinuations regarding potentially nefarious motives for not fully disclosing all the data departed fundamentally from the conclusion of the Bloomberg School’s internal investigation of the study, which he himself cites only selectively.

After a careful review of the 1,800 original questionnaires, the Bloomberg School investigation concluded that the questionnaires were indeed authentic, and found no evidence of falsification, fabrication, or manipulation: “The information contained on the forms was validated against the two numerical databases used in the study analyses. These numerical databases have been available to outside researchers and provided to them upon request since April 2007. Some minor, ordinary errors in transcription were detected, but they were not of variables that affected the study’s primary mortality analysis or causes of death. The review concluded that the data files used in the study accurately reflect the information collected on the original field surveys.”

Doorsteps

Spagat goes on to make much of the ethical conundrums of the Lancet team interviewing Iraqis on their doorstep, which he speculates would have put them at risk from local militias:

“Approaches to potential respondents were essentially public events at the local level and could often have been known by local militias or criminals. A person could answer the door and refuse to be interviewed but he or she might still not be able to demonstrate to intimidating observers that he or she had truly refused. Local militia members, for example, may have simply assumed that someone who had been approached by the survey had disclosed information detrimental to the interests of the militia. Such an individual might have suffered simply from answering the door, regardless of whether or not he or she had actually consented to be interviewed.” [emphasis added]

Spagat’s critique is replete with this sort of speculative conjecture. I asked several Iraqis who had lived in the country during the post-invasion period what they thought of Spagat’s criticisms, and they all agreed that they were largely irrelevant.

If Iraqis felt they should not conduct doorstep interviews due to fears regarding safety, they said, they would simply not answer the door. The fact that interviews were conducted on doorsteps in this manner itself demonstrated that the respondents themselves were perfectly comfortable with conducting the interviews in a relatively public context. Given that Spagat fails to provide any actual evidence that the procedure resulted in anyone coming to harm, it is not clear that the procedure in fact posed any risk. If it had done, Spagat should have been able to substantiate this by pointing to evidence of the risk materialising in actual harm. Since he did not, this suggests that the risk was indeed minimal, a notion supported by the Iraqis I spoke to.

The concern for confidentiality is a reasonable question, but there is no simple solution: If interviewers insisted on being permitted to enter homes, or did not establish that sort of climate of trust, it could have generated other unacceptable security risks for both the research team and respondents.

Neighbourhood trust-building

Spagat also made much of the fact conceded by Lancet authors Burnham and Roberts that to minimise risks to the interviewers, they allowed locals, including children, to inform the wider neighbourhoods of their presence, and wore white-coats to identify themselves.

For Spagat, this created the following problem:

“By encouraging neighbours, with a particular emphasis on neighbourhood children, to explain the purpose of the study, the field teams set in motion uncontrollable dynamics that may have distorted the perceptions of L2’s [Lancet 2006] potential respondents. It is no longer possible to reconstruct how individual participants, many of whom would have first learned about the study from a neighbour (adult or child), understood the purpose of the study at the moment they consented to be surveyed.”

Yet as Burnham explained, the sole purpose of informing locals and children within “a cluster of houses close to one another” was to ensure that it was understood that they were not affiliated to any militias, or the government, but were neutral scientists conducting a survey. In any case, the specific purpose of the survey was explained directly to respondents before consent was received. Spagat raises speculative questions, but once again provides no actual evidence to suggest that this procedure either endangered locals or fatally distorted perceptions of those surveyed.

On the contrary, the Iraqis I interviewed, who are personally familiar with the tense environment of the time, said that this procedure would have greatly aided the research team in establishing trust with the local communities to obtain access to households. Spreading the word among locals and children within particular household clusters would have helped assure potential respondents that the team’s intentions were indeed benign, and had no relationship to local sectarian or governmental interests.

White-coats

The use of “unusual” white-coats, Spagat says further, in such a public context would have undermined the confidentiality of the interviews. However, he overlooks the fact, as my Iraqi sources explained, that by spreading word of the survey in advance, the research team established a local climate that was far more conducive to the safety of respondents in the event that they agreed to conduct the doorstep interviews. Ultimately, locals themselves would have been far more aware of potential local risks than the interview team, and this procedure gave them the opportunity to decide whether they were comfortable or not with being interviewed.

As above — the concern for confidentiality is a reasonable question, but there is no simple solution: If interviewers insisted on being permitted to enter homes, or did not establish that sort of climate of trust, it could have generated other unacceptable security risks for both the research team and respondents.

All the points raised here by Spagat offer valid questions about how to improve such procedures and minimize risks in conflict environments, and suggest that more could have been done. On the other hand, Spagat ignores that in the context of the conflict there were inevitably trade-offs to be made. Ultimately, though, most of his criticisms are speculative and based on theorizing risk dynamics without evidence or understanding of the actual nature of militia activity in Iraq.

George Soros

Surprisingly, given Spagat’s systematic reticence in disclosing his own conflicts of interest and funding sources, he alleges that the 2006 Lancet authors failed to disclose that their research was funded by the Open Society Institute (OSI) of George Soros. Spagat quotes an article in the National Journal which reported that OSI funds had come through the Massachusetts Institute of Technology (MIT).

But Spagat here repeats a falsehood disproved two years earlier. As MIT’s John Tirman told the British journalism watchdog Media Lens in January 2008:

“Open Society Institute funded a public education effort to promote discussion of the mortality issue. The grant was approved more than six months after I commissioned the survey, and the researchers never knew the sources of funds. As a result, OSI, much less George Soros himself, had absolutely no influence over the conduct or outcome of the survey. This was told to the authors of the National Journal article at least twice. One must conclude that their misrepresentation of this — among many other issues — was intended to sensationalize their version of the story and color the readers’ opinion about ‘political bias.’ This is contemptible malpractice on their part.”

The survey had been commissioned long before the Open Society Institute funding, with internal funding from MIT’s Center for International Studies. Further, the OSI funding had nothing to do with supporting the actual survey or funding its research. Correspondence between Tirman and the author of the National Journal article, Neil Munro, obtained by Media Lens proved that Munro had been fully aware of these facts, but went ahead and printed his falsehood anyway.

Neil Munro is now employed by The Daily Caller, the right-wing online news site launched by a former chief policy advisor to Vice President Dick Cheney. Munro has also violated basic journalistic norms by heckling Obama during a morning press briefing about immigration.

Although the refutation had been in the public record for years, Spagat went ahead and repeated Munro’s discredited falsehood.

Data entry

While independent experts and the Bloomberg School’s own internal investigation all confirmed that the questionnaires used for the survey were entirely appropriate for the research, Spagat — who unsurprisingly was not granted access — managed to obtain “the English-language list of questions and a data entry form” from Munro, who claims to have received them from “a third party who had apparently obtained it from an L2 author.” He then subjected these to a critique.

Every source in this chain is questionable: Spagat himself, whose conflicts of interest and vested interest raise all sorts of questions; Neil Munro, who demonstrably lied in his National Journal article; and the unidentified “third party” who supposedly obtained these materials from an unidentified Lancet author.

Spagat spends some time picking apart the form trying to prove that inexact wording would lead to imprecise questions and, therefore, potentially misleading answers. All this is irrelevant, though, because Spagat is ultimately analysing materials he obtained from a journalist who deliberately reported a falsehood to defame the study in question. There is no independent evidence proving that what he is analysing had anything to do with the Lancet survey.

Main street

After raising these ethical questions, Spagat attempts to get substantive. He reiterates the ‘main street bias’ argument, which holds that as most of those interviewed were families residing on main streets, this would lead to an overestimate of deaths because families were at greater risk of violence on main streets as opposed to anywhere else.

Spagat references an earlier paper where he and his co-authors attempt to show that main street bias could lead to overestimates by a factor of 3. But a working paper version by Spagat putting forward the same argument was quickly demolished by science blogger Dr. Tim Lambert of the University of New South Wales, who writes for National Geographic’s scienceblogs.com.

Lambert pointed out, citing a conversation between Prof. Stephen Soldz and Dr. Jon Pedersen, head of research at the Fafo Institute for Applied International Studies, that “if there was a bias, it might be away from main streets [by picking streets which intersect with main streets].” Pedersen, Lambert noted, “thought such a ‘bias,’ if it had existed, would affect results only 10% or so.”

The biggest problem with the main street bias argument is the strange presumption that most Iraqis were killed at their homes on main roads. Spagat et al. fail to provide any evidence for this presumption beyond frivolous speculation and references to IBC data. The latter, as seen above, is deeply selective and cannot offer any basis to extrapolate trends of violence that are not simply statistical artifacts, as Ball et al. have proven decisively.

Once again, Spagat and his colleagues rely on their own ignorance of the real dynamics of the Iraq conflict on-the-ground, and the credulity of readers who are looking to the experts for that sort of insight.

As the Lancet authors rightly point out, most violence in Iraq occurred in public spaces. A car bomb exploding in a market, for instance, would injure and kill people from throughout the neighborhood. Attacks consisted of air strikes, random shootings, shelling, raids, and death squad assassinations. Iraqis were killed on rooftops, on roadsides, at check-points, in markets, shops, as well as in their homes. The assumption that violence was concentrated disproportionately to target homes on residential main roads should never have made it into an academic journal.

As Lambert concluded, “the only way” Spagat et al. were able to make ‘main street bias’ significant as a “source of bias was by making several absurd assumptions about the sampling and the behaviour of Iraqis.”

Success rate

Following from the main street bias theory, Spagat’s next move is to compare the 2006 Lancet survey’s success rate in finding respondents at home for interview with the success rates of previous studies: primarily, the first 2004 Lancet study; a UN Development Program survey conducted in partnership with the Iraqi Ministry of Planning and Development Cooperation, called the ‘Iraq Living Conditions Survey’ (ILCS); and a survey conducted by the World Health Organization (WHO) with the Iraqi Ministry of Health (MoH), the ‘Iraq Family Health Survey’ (IFHS).

He argues that since the 2006 Lancet survey reported a very high success rate, while the previous surveys had a lower success rate, something must be amiss: in the period between the older and newer surveys, hundreds of thousands of Iraqis fled or were displaced, so we would expect that the research team would be less successful in finding people at home to interview, not more. He then attempts to quantify the odds of the Lancet’s claimed success at finding more respondents at home than the other studies. By his calculation, those odds are at least “190 to 1,” and more likely “nearly 100,000 to 1.”

Spagat concludes:

“… these comparisons provide some evidence of fabrication and falsification both in L2’s reported success rates in visiting selected clusters and in L2’s reported contact rates with selected households.”

Ironically, the most obvious explanation for the higher success rate of the Lancet authors is mentioned by Spagat himself. As the research team made a point of giving locals within a certain cluster of households a degree of advance notice to increase trust and minimize risks to the research team, they increased the probability that locals would choose to be at home. This easily explains why the 2006 Lancet study, the only Iraq death toll survey to have used this procedure, had a marginally higher response rate than the other studies.

This example in itself demonstrates the specious nature of Spagat’s critique. Based on unfounded assumptions and generalizations, the ‘improbabilities’ he calculates as evidence of falsification or fabrication are themselves merely statistical artifacts that ignore the complexities of the real world.

The other issue, of course, is his curious selectivity. Spagat uses other studies as a baseline for normality, but fails to subject their methods to the same level of scrutiny. He therefore has no idea whether those studies offer an accurate baseline.

As noted in a paper led by Prof. Christine Tapp of Simon Fraser University published in the journal Conflict and Health, there were significant limitations with the ILCS. The survey was “conducted barely a year into the conflict,” had “a higher baseline mortality expectation, and differing responses to mortality when houses were revisited.” Tapp et al. note that criticisms of the study acknowledged by its authors, which Spagat ignores, include “the type of sampling, duration of interviews, the potential for reporting bias, the reliability of its pre-war estimates, and a lack of reproducibility.”

It is not clear, then, that its lower response rate should be taken as an ideal norm against which to test the Lancet’s. Similar concerns apply to the IFHS, where serious questions have been raised about the role of Iraqi Ministry of Health officials in administering and undertaking the research in a context of sectarian warfare. The survey team even introduced themselves to respondents as being from MoH. As the PSR report shows, significant evidence exists that Iraqi government health officials have knowingly downplayed the civilian death toll throughout the conflict. Why, then, should it be assumed that the IFHS response rate is representative of reality on the ground, as opposed to the sectarian politicization of the MoH research team?

Extrapolation

Prof. Spagat then attempts to show that the 2006 Lancet authors artificially extrapolated their results from previous studies, which he demonstrates with a graphic. His procedure is to argue that two previous mortality surveys from Kosovo and the Democratic Republic of Congo are “in near perfect alignment” with the findings of the new Lancet study, suggesting no less than “data falsification.”

[Figure: Spagat’s cherry-picked proof of extrapolation]

This claim had been previously published elsewhere by Spagat in an early version of his 2010 paper. The argument was roundly refuted by Dr. Lambert, who used the same method to plot “extra points” and similar regression lines for other conflicts (including Bosnia, Darfur, and the first Lancet survey). By reproducing the same kind of alignment with other data, this test showed that Spagat’s ‘proof-of-extrapolation’ line was yet another meaningless statistical artefact.

[Figure: Lambert replicated Spagat’s method to show that similar regression lines could be drawn using other conflicts]

Lambert concluded:

“With all those points and lines, it’s not hard to find two points that line up with one of the four possible points of L2. (In fact, as well as the ones Spagat shows, L1 and DRC also work). Once you’ve done this, all you have to do is erase all the other points and lines to wind up with Spagat’s graph. Needless to say, doing this is dishonest cherry picking, especially when you are doing it to accuse researchers of fraud.”
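Lambert’s point is a general one about selective plotting, and it can be illustrated with a small simulation. The sketch below uses purely synthetic random points, not data from any mortality survey, and a hypothetical tolerance `tol` for what counts as “lining up”. It simply counts how many triples of points fall close to a common straight line as more points become available to choose from.

```python
import itertools
import random


def near_collinear_triples(points, tol=0.01):
    """Count triples of points whose triangle area is below tol.

    A small triangle area means the three points lie close to one line.
    The tolerance is an illustrative assumption, not a published threshold.
    """
    count = 0
    for (x1, y1), (x2, y2), (x3, y3) in itertools.combinations(points, 3):
        area = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2
        if area < tol:
            count += 1
    return count


# Synthetic points in the unit square (fixed seed for repeatability).
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(12)]

few = near_collinear_triples(points[:3])   # only one triple to choose from
many = near_collinear_triples(points)      # 220 triples to choose from

print(f"Near-collinear triples among 3 points:  {few}")
print(f"Near-collinear triples among 12 points: {many}")
```

Because the only triple among the first three points is also a triple among all twelve, the second count can never be smaller than the first. The more candidate points there are, the more opportunities exist to find an apparently striking alignment and discard the rest.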

At that time, Spagat had also said: “This graphic was passed to me by researchers who asked to remain anonymous.” He chose not to disclose this in his Defence and Peace Economics paper.

Lambert added: “Well, the graph does suggest that there has been some data manipulation going on, but the manipulators are not the L2 authors… Since Spagat didn’t produce the deceitful graphic, he isn’t guilty of fraud, just of incompetence. As is David Kane, who describes Spagat’s paper as a ‘tour de force.’”

According to Prof. Andrew Gelman, director of the Applied Statistics Center at Columbia University, Lambert’s rebuttal of Spagat’s graph was compelling. Gelman himself is critical of the 2006 Lancet study and had accepted Spagat’s arguments that the study is unreliable, but noted:

“When I saw [Spagat’s] graph on page 16 (in which three points fall suspiciously close to a straight line, suggesting at the very least some Mendel’s-assistant-style anticipatory data adjustment), I wondered whether these were just three of the possible points that could be considered. Investigative blogger Tim Lambert made this point last year, and having seen Lambert’s post, I don’t see Spagat’s page 16 graph as being so convincing.”

Although the criticisms of Spagat’s graph were published earlier, he did not bother to address them in the 2010 iteration of his paper in Defence and Peace Economics.

Supervision

Spagat also raised questions about the supervision and qualifications of the interview team, noting that the on-the-ground supervisor, Riyadh Lafta, “is not available to answer questions about how he supervised the L2 team.” This gives him leeway to open up all sorts of questions to cast doubt on the integrity of the supervision process. Once again, he cites National Journal’s Munro as a basis for his concerns.

Yet as the Lancet authors had already explained, Lafta’s silence was due to “concerns for his safety and that of his family.” Lafta had been smeared in the National Journal article as a senior official in the Saddam government, working in the Ministry of Health, when in fact, he was “part of the university system not the Ministry of Health. He was one of the very few doctors who refused to join the Ba’ath Party under Saddam.” Lafta had consulted for the UN several times, and his research was widely respected.

Despite Spagat’s claims (and he gives no indication in his paper that he attempted to ask the Lancet authors themselves), the authors had confirmed in the public record that all the interviewers were qualified physicians with previous survey and community medicine experience, and former students of Lafta. They had all also received two days of training. Lafta had personally accompanied some of the survey teams in the field. The survey teams had also taken several pictures in the field, which cannot be released due to security concerns.

Workload

Spagat’s next substantive point is that the workload of the 2006 Lancet interview team was so time-pressured that questions as to whether it was even possible for the interviewers to conduct the survey properly and without fabrication are valid. He refers primarily to an argument made by his frequent co-author, IBC director Madelyn Hicks, which purportedly shows that “it is implausible that the teams could have worked on such a punishing schedule while maintaining acceptable ethical standards.”

As the PSR report noted, though, Hicks had “overlooked the fact that two teams both consisting of two women and two men carried out the study — information that is also registered in the study report.”

In response to their critics, the Lancet authors explained in Nature in 2007: “1,849 interviews in 49 days described in our study suggest that 38 interviews had to be conducted each day by our eight interviewers.” PSR notes that: “For the most part, the teams informed the families about their plan beforehand through local children, and in the households where there were no deaths — the overwhelming majority, of course — the polled family only had to answer five questions.”
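The arithmetic behind the workload figure is easy to check. The sketch below uses only the numbers quoted above (1,849 interviews, 49 days, two teams) and assumes, purely for illustration, that the work was spread evenly across days and teams; the variable names are my own.

```python
# Figures quoted in the text above.
interviews = 1849
days = 49
teams = 2  # per the PSR report: two teams of four interviewers each

# Assumption: interviews were spread evenly over the days and the teams.
per_day = interviews / days
per_team_per_day = per_day / teams

print(f"Interviews per day, all interviewers: {per_day:.1f}")
print(f"Interviews per day, per team:         {per_team_per_day:.1f}")
```

Rounded, the first figure matches the 38 interviews per day cited by the Lancet authors, and splitting it between two teams roughly halves the daily load each team had to carry.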

Inconsistencies

Spagat also makes much of inconsistent responses from the lead Lancet authors on how their teams conducted the survey and ensured that they did not over-sample ‘main streets’, suggesting that this is evidence of the “implausibility” of their claims at best, and outright fabrication at worst.

This is perhaps the only truly valid criticism Spagat musters in his paper, and underscores lessons to be learned in the design and implementation of cluster sampling. But the key question is whether this would justify dismissing the survey’s reliability entirely, or condemning it as falsified or fabricated. Leading statistician Prof. Gelman, who was largely convinced by Spagat that the Burnham et al. study was untrustworthy, still disagreed with Spagat on whether the inconsistent responses proved the survey’s findings were fraudulent.

“It’s surprisingly difficult for people to write exactly what they did,” he explained with reference to providing details of sampling methods. The contradictory descriptions of these methods did not necessarily prove fraud; they could also be “evidence that they don’t fully understand cluster sampling (which actually is a complicated topic that lots of researchers have trouble with), or evidence that their sampling was a bit of a mess (which happens to the best of us) and that they didn’t do a great job explaining it.”

Gelman later clarified:

“The study looked reasonable to me… Burnham et al. provide lots of detail on the first stage of the sampling (the choice of provinces) but much less detail later on… Unfortunately, it is a common problem in research reports in general: to lack details on exact procedures; it’s surprisingly difficult for people to simply describe exactly what they did… This is a little bit frustrating but unfortunately is not unique to this study. Unfortunately, I’d still have to go with this general position: it’s common to not share data or methods (indeed, as anyone knows who’s ever tried to write a report on anything, it can be surprisingly effortful to write up exactly what you did), so that alone is not evidence of a serious flaw in the research.”

In any case, the contention that over-sampling of main streets invalidates the survey’s findings is untenable — as scientist Jon Pedersen noted, in reality the margin of error generated would not fundamentally alter the survey’s overall findings.

Death certificates

Spagat’s final main line of attack regards death certificates. The 2006 Lancet team requested death certificates for 87% of cases — in the other cases, interviewers forgot to request these. They confirmed that respondents supplied death certificates in 92% of all the cases for which they asked.
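Taken together, the two percentages imply the share of all recorded deaths for which a certificate was actually produced. The short calculation below derives that share from the figures just quoted; the combined figure is my own derivation for illustration, not a number reported in the study.

```python
# Percentages quoted in the text above.
asked_rate = 0.87      # share of deaths for which a certificate was requested
produced_rate = 0.92   # share of those requests where one was supplied

# Share of all recorded deaths backed by a certificate shown to interviewers.
share_confirmed = asked_rate * produced_rate

print(f"Deaths with a certificate produced: {share_confirmed:.1%}")
```

On these figures, roughly four in five recorded deaths were backed by a certificate that interviewers saw.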

This issue provides a particularly powerful example of the manipulative way in which Spagat abuses statistics:

“The very high number of estimated deaths in L2 implies that the official death certificate system has issued, but failed to record the issuance of, about 500,000 death certificates during the L2 coverage period. This forces L2 into a very delicate balancing act. For the death-certificate data to be valid it must be the case that Iraqi authorities issue death certificates for virtually all violent deaths and yet that same system fails to record the fact that death certificates have been issued roughly 90% of the time. Alternatively, it could be that Iraqi Ministry of Health is engaged in a massive and highly successful cover-up of deaths that have actually been documented through death certificates. This seems unlikely.”

This argument essentially regurgitates (without attribution) Sloboda’s critique aired courtesy of USIP in 2007, which resurfaced in the Munro article that Spagat frequently quotes as authoritative.

But this could only seem credible to someone unfamiliar with empirical realities concerning Iraq’s death certificate system. As John Tirman points out in his book, The Deaths of Others (Oxford University Press, 2011):

“Death certificates may be issued to the bereaved by a local doctor but not recorded by anyone else; the information flows in such circumstances are notoriously unreliable, and the health care system itself — those doctors and others who would issue death certificates and then collate and send the data to the Ministry of Health — was under intensive attacks.”

As Nature reported, the Lancet team had themselves pointed out early on that:

“… the process for issuing death certificates still works well in Iraq, but the system for monitoring the number of certificates issued does not. Even before the war, note the researchers, the government’s surveillance system captured only one-third of all deaths.”

What Spagat, Sloboda, and colleagues have consistently, and somewhat laughably, ignored is that the system for issuing death certificates locally was distinct from the system for recording them at a national level. Doctors were able to issue death certificates locally using a paper-based system, and would then need to separately attempt to have the certification logged centrally. The entire process of issuing and registering death certificates at a national level was (and remains) a long, bureaucratic and manual process. This means that it was perfectly possible for local doctors to issue death certificates locally without them being recorded in the central national database. During the height of the conflict, this was highly probable.

Spagat, however, prefers to take it on faith that the system should have functioned perfectly and synchronously, with issuance of death certificates by local medics correlating unfailingly with the national registration system in the midst of an unprecedented invasion, occupation, and sectarian civil war.

Contrary to Spagat’s unwarranted assumptions, what little evidence does exist of Iraq’s death registration system in this period shows how disparate the local and national systems often were.

As one UN-funded study of Iraq’s civil registration system in Canadian Studies in Population observes:

“On the vital statistics level, some offices were looted, with some infrastructure destroyed, and some healthcare personnel were lost. Both the loss of personnel and the destruction of infrastructure negatively affected the registration of vital events during this period and up to about 2007. From 2007 onwards, the vital statistics system has slowly recovered. It is believed that by 2011, the recording of vital events has returned back to its normal level.”

This explains why hundreds of thousands of Iraqis were able to obtain death certificates that were not registered on the Iraqi government’s national database.

Death certificate patterns

Spagat attempts to take the death certificates ‘anomaly’ further by searching ardently for patterns in the way the Lancet team recorded the production of death certificates.

“For violent deaths, all failures to produce death certificates when asked were in a single governorate, Nineveh, whereas for non-violent deaths these failures were spread across eight governorates,” writes Spagat. “It is implausible that the system of issuing death certificates and families taking care of them is nearly perfect in all but one governorate in the case of violent deaths whereas these systems are less reliable for non-violent deaths in eight governorates.”

He also makes much of the calculation that the research team was six times more likely to forget to ask for a death certificate for non-violent deaths than for violent deaths. It would hardly be surprising, though, if the team unconsciously assumed it more important to verify certification for violent deaths than non-violent deaths.

Nevertheless, all this gives Spagat grounds to calculate the odds of the Lancet team obtaining the death confirmations as astronomically small:

“(1) Using the [lower] death-certificate confirmation rate for L1 [the first Lancet survey in 2004] of 80% and assuming statistical independence across deaths, the odds against 180 confirmations in a row are 2.7×10¹⁷ to 1. In fact, a more direct comparison is possible for the violent deaths recorded in L2 and occurring during the L1 coverage period, i.e. through September 2004. L2 claims a perfect record of 60 confirmations in 60 attempts for violent deaths during the L1 sampling period, for which we can calculate odds of more than 650,000 to 1 against. (2) Using the confirmation rate for non-violent deaths in L2 of 92%, the odds against are more than three million to 1. (3) Even if we arbitrarily and implausibly assume a 0.98 probability that death certificates can be produced for each violent death we still get odds of 38 to 1 against.”

He therefore concludes that “there is likely fabrication in the death-certificate data in L2 and that these data do not give reliable support to L2’s very high estimated death rate.”
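The arithmetic behind these odds is simple enough to reproduce. The sketch below is illustrative only: the function name `one_in_n` is hypothetical, and it assumes, as the quoted passage does, that every confirmation is an independent event with the same probability. Under that assumption the chance of an unbroken run is the per-death rate raised to the number of attempts, and the quoted odds appear to be the reciprocal of that chance.

```python
# Illustrative sketch only: reproduces the arithmetic behind the quoted odds.
# Assumption (taken from the quoted passage): each confirmation is independent
# and has the same probability of success.

def one_in_n(rate: float, attempts: int) -> float:
    """Return N such that an unbroken run of confirmations is a 1-in-N event.

    `rate` is the assumed per-death probability that a certificate is produced;
    `attempts` is the number of deaths for which a certificate was requested.
    """
    p_all = rate ** attempts  # probability that every single attempt succeeds
    return 1.0 / p_all


# (rate, attempts) pairs as given in the quoted passage.
scenarios = {
    "80% rate, 180 confirmations": (0.80, 180),
    "80% rate, 60 confirmations": (0.80, 60),
    "92% rate, 180 confirmations": (0.92, 180),
    "98% rate, 180 confirmations": (0.98, 180),
}

for label, (rate, attempts) in scenarios.items():
    print(f"{label}: roughly 1 in {one_in_n(rate, attempts):,.0f}")
```

The output depends entirely on the assumed rate and on the independence assumption; the sketch reproduces the calculation but does not test whether either assumption holds in a conflict zone.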

As usual, Spagat’s numbers rest on reprehensible ignorance of the conflict, whose local dynamics provide reasonable explanations for these phenomena.

Regarding the fact that the second Lancet survey obtained more death certification confirmations than the first 2004 survey, Spagat’s argument is that the second should have received fewer, due to displacement and refugees. But the second survey was conducted much longer after the 2003 ‘shock and awe’ invasion, and therefore in less fraught circumstances that could have been more conducive to the issuing of death certificates.

While victims’ families would usually seek death certificates from local medics in the case of violent deaths to seek compensation, their ability to do so would depend on the circumstances of the violent deaths. Often the bodies of civilians killed violently would be released quickly and directly to families, bypassing formal registration processes. Local politics and sectarian tensions could have easily dissuaded local doctors from issuing death certificates for violent deaths for fear of retribution, especially with ongoing pressure from Ministry of Health (MoH) officials to downplay violent deaths — and such risks could have been worse at MoH hospitals.

At the time, the Iraqi Ministry of Health was dominated by forces loyal to Muqtada al-Sadr, who controlled Shi’ite death squads. MoH hospitals functioned as outposts for the Shi’a militia, and often Sunnis were even refused treatment or shot in their beds. Hospitals and ambulances were being used to conduct killings and kidnappings. Depending on the locality and sectarian affiliations of those seeking certification, obtaining death certificates attributed to violent deaths of a sectarian nature, or due to coalition actions, would not have been a simple task.

Nineveh has a significant Sunni population, and both US and US-backed Iraqi forces faced considerable challenge from insurgents there after the 2003 invasion. Like Fallujah, Nineveh had been subjected to particularly sustained levels of invasive US counter-insurgency violence. After Fallujah, many insurgents moved to Nineveh. The failures to produce death certificates for violent deaths could easily have been directly related to the inability to obtain certificates due to such sectarian tensions. Obtaining death certificates for non-violent deaths, conversely, would face no such tensions.

As for many respondents’ inability to produce death certificates for non-violent deaths across other governorates, the reality is that a major incentive to obtain a death certificate would be to seek compensation for a violent death. In cases of non-violent death, that incentive would not exist. Outside Nineveh, with sectarian tensions less prevalent, families would likely have found it much easier to seek death certificates for violent deaths, and would have been incentivized to do so to seek compensation.

In other words, the general distinction between Nineveh and other governorates, if indeed valid, speaks to variations in the violence between the governorates, its impact on local sectarian tensions, and how this interplayed with the way families chose to engage with local death certification. Put simply, the variation illuminates the extent to which the impact of war in Nineveh seemed to generate violent deaths that family members were reluctant to certify.

The 2006 Lancet survey’s death certificate findings, then, are entirely consistent with the sectarian tensions that were localized and concentrated in particular areas depending on resident ethnic and tribal groups, and the political impact of conflict. But Spagat must pretend that such complexity doesn’t exist in conflict zones for his probability odds to have any validity.

Outlier

Spagat’s last resort is to compare the 2006 Lancet findings with the estimates of other studies, and particularly with information from the IBC database. His various points boil down to essentially two arguments: firstly, that the parity between numbers from hospital records, morgue records, most of the other reliable surveys, and IBC’s total record of deaths by armed violence, increases the likelihood that those lower numbers are reliable; secondly, that the vast gulf between the Lancet study’s implications for the scale of violent incidents, and the level of violence recorded in the IBC database for such incidents, shows that the Lancet numbers are “implausible.”

The problem here is that there is ample evidence that Iraqi hospital and morgue records, along with both Western and Arab media outlets, severely undercounted the violence. Spagat expresses his incredulity at the idea that 95% of car bombs could go unreported even in Baghdad, although as already noted, journalist Dahr Jamail confirmed from his on-the-ground experience during the conflict that the vast majority of violent incidents in Baghdad did, indeed, go unreported.

Iraqi morgues were also notorious for inconsistent or non-reporting. Victims of US or terrorist killings were rarely taken to morgues, resulting in substantial undercounting of deaths. An employee of Baghdad’s central morgue statistics office, for instance, told US National Public Radio (24 February 2009) that since 2004:

“By orders of the minister’s office we cannot talk about the real numbers of death… The minister would say [on the news] 10 people got killed all over Iraq, while I had received in that day more than 50 dead bodies just in Baghdad. It’s always been like that, they would say one thing, but the reality was much worse.”

Reporting from the al-Adhamiya area of Baghdad, Jamail interviewed the manager of a park which was now an unofficial cemetery carrying 5,000 graves. Neither the government, nor the media, were keeping count of these deaths, the manager told Jamail. Although he kept the names of the dead in a logbook, he said, “We’ve never had anyone come from the government to ask how many people are here. Nobody in the media nor the Ministry of Health seems to be interested.”

“Such graveyards, and there are many, raise questions about the real death toll in Iraq,” wrote Jamail. “The unofficial cemeteries around Iraq hold their own additions to the numbers doing the rounds. And no one knows what these add up to.” Elsewhere, Jamail confirms that this was even more the case outside Baghdad, where most media outlets dared not venture.

Iraqis look at rows of graves at a makeshift overflowing cemetery built in a soccer arena in Fallujah (May 2004)

Thus, the parity between IBC’s database and other studies proves little. There are strong reasons to suspect that MoH, itself complicit in much of the violence in Iraq, was politically motivated to undercount deaths, and underplay the levels of violence. Working hand-in-hand with its US benefactors, the MoH data simply cannot be taken at face value. Neither can data from morgues and hospitals in an environment of extreme sectarian conflict, where in many cases unembedded reporters like Jamail directly confirmed that thousands of bodies at a time were not even making it into morgues or hospitals.

Based on these facts, it is entirely plausible that the vast majority of violence across Iraq went unreported in the media, and unrecorded by any official authorities.

Yet Spagat proceeds with the following sort of analysis:

“… it still seems unlikely that there were at least three separate shooting incidents in which US soldiers killed residents of four households in this small neighbourhood of 40 contiguous households within a span of 17 months.”

The only basis for such disbelief is a lack of understanding of the nature of the occupation regime imposed in Iraq, particularly in areas like Nineveh where insurgency was rife.

A second example:

“The final death attributed to the US Army is a three-year-old boy claimed to have been crushed by an American military vehicle in August 2005 with death certificate confirmation. This death does not appear in the IBC database although it is a newsworthy incident if true.”

The simple reality is that while US and British atrocities against civilians did receive some coverage, most such atrocities were not treated as newsworthy, thanks to the efforts of both the coalition occupiers and their imposed Iraqi government.

The vast majority of reporters in Iraq during the war were embedded with coalition military units, and therefore subjected to heavy de facto reporting constraints. Even those who were unembedded were largely unable to travel beyond Baghdad due to security issues.

A third example:

“Cluster 34 contributed about 48,000 violent deaths blamed on US forces to L2’s central estimate, roughly 100 times the number of civilian deaths fully or partially attributed to US forces by IBC in the entire governorate of Nineveh… To summarise, if the Cluster-34 data are true, the behaviour of US soldiers within the cluster was much worse than the behaviour throughout the whole of Iraq both of US soldiers themselves and of all other agents.”

This paragraph reveals just how ideological Spagat’s approach is. He can only make this assumption on the basis of the IBC database, which as we have seen, is less likely to document especially intense episodes of violence committed by coalition forces.

The ‘dirtier’ the armed violence by our own governments, the less likely media will be able to report it — and especially in contexts where coalition forces are carrying out extreme counterinsurgency violence, embedded media units would not be permitted to witness and report such atrocities.

In summary, this means that large-scale massacres by US forces in particular were unlikely to be picked up in the IBC database at all — and therefore one cannot cite the database as meaningful evidence regarding the patterns of behaviour of US soldiers.

Conversely, the picture of US forces committing frequent large-scale unreported massacres is one that has been confirmed by numerous testimonials from soldiers who served during the conflict.

Iraq Veterans Against the War (IVAW) has compiled thousands of pages worth of testimony from US soldiers documenting how coalition forces: established “free fire zones” in civilian areas where there were supposedly “no friendlies”, but where no enemy combatants were in sight, just civilians; fired at anyone digging near road-sides; imposed night-time curfews on cities then shot at anything that moved in the dark; deliberately used Iraqi children as human shields; fired indiscriminately at civilian cars, houses and apartment blocks; the list goes on and on.

Jason Wayne Lemue, a marine who served three tours in Iraq, said that anyone seen “carrying a shovel, or standing on a rooftop talking on a cell phone, or being out after curfew were to be killed. I can’t tell you how many people died because of this. By my third tour, we were told to just shoot people, and the officers would take care of us.” Another Iraq war veteran, Jason Moon, explained the thinking behind such orders: “If you kill a civilian he becomes an insurgent because you retroactively make that person a threat.”

The testimonials of these veterans prove that frequent murders and large massacres such as those documented by the 2006 Lancet survey’s data in Nineveh were par for the course under the increasingly fluid ‘rules of engagement’ US commanders gave their units. Those veterans also largely confirm that atrocities they witnessed or committed were not recorded or reported by the media, Iraqi or US authorities.

Dirk Adriaensens of the Brussels-based Bertrand Russell Tribunal also documents several clear examples of deaths completely missed by the media and IBC database. He points out that the Tribunal has an incomplete list of 448 murdered Iraqi academics in its database, compiled from many different sources. IBC has 108 academics in its database, and so lists just 24% of those reported by the Tribunal.

Spagat seems surprisingly, and reprehensibly, clueless about the state of the media in Iraq after 2003. In his paper, he argues that comparisons to the severe lack of media coverage of the largest Guatemala massacres in the countryside are inapplicable to Iraq:

“IBC incorporates news wires, many non-mainstream news sources and official figures like those of the Baghdad morgue and the Ministry of Health. Moreover, Iraq at this point in time is far more in the media spotlight than Guatemala was in the late 1970’s and early 1980’s and modern technologies like the internet and cell phones carry information much more freely out of Iraq in the 21st century than was the case in Guatemala nearly 30 years ago.”

This is a truly bizarre and misleading statement. It is widely known that it was unofficial Pentagon policy to target critical journalists in Iraq who were not embedded with military units. An analysis of British press reporting on the war by a team at Manchester University in 2003 found that fewer than 10% of stories made any mention of civilian casualties.

According to one Iraqi journalist, Bassam Sebti, in 2006, clashes in the Sunni neighbourhood of Adhamiya “were so heavy that no one dared to go in there. In addition, if insurgents discover I am a journalist working for Western media, they may kidnap me and kill me.” He noted the “dwindling presence” of the foreign press in Iraq as the violence escalated. Another Iraqi journalist, Salah Hassan, explained that while Iraq’s media was not independent, Western journalists were increasingly under threat:

“Iraq media are not good media because we are under occupation and the occupation controls the media. Because occupation killed many journalists. They want Iraq to be empty of Western journalists, to destroy Iraq. In Fallujah and other cities they did many crimes freely because there were no journalists there.”

The Committee to Protect Journalists, Fairness and Accuracy in Reporting, Reporters Without Borders, and many other press monitoring bodies have documented the increasing risks to Western, Iraqi and Arab journalists in Iraq since 2003, their targeting by the US military and Iraqi government, and the decline in press freedom inside Iraq. Under Saddam, Iraq ranked a dismal 130 in the Reporters Without Borders press freedom index. By 2006, under US occupation, its ranking had fallen further to 154.

In this context, it is highly likely that the media missed the vast bulk of the violence in Iraq committed by US occupation forces, and the US-backed Iraqi government.

The Lancet survey still stands

Spagat concludes that the 2006 Lancet survey “cannot be considered a reliable contribution to knowledge about mortality during the Iraq War.” This brief review of Spagat’s paper demonstrates the opposite: that the bulk of Spagat’s substantive lines of attack against the survey are deeply flawed.

Spagat raises some valid questions about the Lancet authors’ public descriptions of the exact sampling method, but apart from that, there is very little else that is original or substantive. It is difficult to understand how such a weak, frivolous paper was able to pass peer-review, and it is easy to understand why the Lancet authors saw no point in responding to it.

A more balanced perspective on Iraqi mortality estimates was published in Conflict and Health in March 2008 by a team of seven scientists. MIT’s John Tirman points out that this was the “most authoritative review of all the mortality estimates.”

The review, which examined thirteen mortality estimates, concluded that population-based epidemiological surveys are superior to those based on passive surveillance. The latter can help provide important detail, context and insight, but alone they are no substitute for the former. They also found that:

“… of the population-based studies, the Roberts and Burnham studies [in the Lancet] provided the most rigorous methodology as their primary outcome was mortality. Their methodology is similar to the consensus methods of the SMART initiative, a series of methodological recommendations for conducting research in humanitarian emergencies.”

The new 2013 PLOS Medicine survey corroborated the 2006 Lancet survey’s higher estimates, concluding that 461,000 excess deaths had occurred in Iraq from 2003 to 2011, two-thirds of which could be attributed to violence. The PLOS authors clarified that they believe this was a conservative underestimate.

A co-author of that study, Prof. Tim Takaro of Simon Fraser University’s Faculty of Health Sciences, was also a co-author of the new PSR report, which argued that the most accurate figure was probably between the 2006 Lancet’s estimate (for 2003–2006) and the 2013 PLOS estimate (for 2003–2011). The Lancet estimate, extrapolated up to 2015, implies nearly 1.5 million violent deaths.

Takaro et al. thus concluded that the true death toll from armed violence would lie somewhere between that higher end, and the more conservative PLOS findings. Incorporating excess deaths from indirect impacts of the war back into this estimate would lead to an overall estimate of just over a million Iraqi deaths, which the PSR authors stress itself remains a conservative figure.

Prof. Takaro was also among the seven scientists who had published the systematic review of Iraq mortality studies in Conflict and Health, finding the Lancet surveys to be the best. He is also familiar with Spagat’s work, including the latter’s 2010 critique, which he found unconvincing, and he critiqued Spagat’s and IBC’s rejections of the Lancet surveys in his PSR report.

Spagat’s paper, like much of his previous conflict analysis work, is not just fundamentally unethical and politically compromised; it also repeatedly rigs, manufactures and manipulates data to reach his desired objective of dismissing the Lancet survey with finality. His verdict that the 2006 Lancet survey makes no reliable or valid contribution to knowledge about the Iraq War death toll is not sustained.

Spagat’s modus operandi is to begin from highly questionable (and usually quite ignorant) assumptions about what ‘ought’ to happen in a conflict zone, and then to generate speculative statistical artifacts of improbability to prove high chances of falsification or fabrication.

Throughout, these arguments demonstrate a degree of willful dishonesty, and worse, fraudulent distortion and misrepresentation. There can be no doubt, as the doctors group PSR has conceded, that there are legitimate criticisms of the 2006 Lancet survey, and that scrutiny of the survey’s design and methodology is a welcome path to improving knowledge.

But what we see here, instead, is simply a transparent campaign to discredit the survey’s alarming findings and remove them from the debate over the Iraq death toll. Given the institutional backing behind Spagat, Sloboda, Dardagan, and others, this is not being pursued as a matter of impartial scientific inquiry, but is clearly politically-motivated to serve the agenda of the pro-war Western foreign policy establishment.

This investigation confirms that Spagat and the IBC are part of a pseudoscientific campaign financed by the US and Western governments, that is undermining confidence in epidemiological surveys, and discrediting higher death toll estimates of US-led wars in Iraq, Afghanistan, Colombia, and beyond.

In this case, Spagat has consistently concealed his fundamental conflicts of interest from the journals in which he publishes. Those conflicts stem from his longstanding ties to the US government, which funds the very conflict databases that he uses throughout his recent ‘scholarship.’

Given the poor nature of Spagat’s arguments, it is likely that such conflicts of interest played a key role in enabling his Defence and Peace Economics article to pass peer-review.

Embedded offense economics

A close analysis of the Defence and Peace Economics journal reveals the ideological and institutional influence of both the MoD and the Pentagon.

Far from being an independent, level-headed academic journal, Defence and Peace Economics is a pro-military publication that is ideologically slanted toward promoting and defending US global hegemony.

In a brief introduction celebrating the 20th anniversary of the journal in 2010, Defence and Peace Economics editorial board member Prof. Martin McGuire proclaimed in his round-up of the journal’s past coverage:

“Objective observers, I believe, would generally agree that American military hegemony has been a net benefit to the planet over the past six decades. There has been no Great Power War. In particular, those who have escaped the misfortune of being perceived a threat to US interests and whims have prospered.”

McGuire himself is a longtime US government consultant, who has in particular provided his services to the US State Department and US Department of Defense, including the Office of the Secretary of Defense.

The editor-in-chief responsible for receiving and publishing Michael Spagat’s paper on The Lancet Iraq mortality survey was Prof. Daniel Arce from the University of Texas, Dallas. He had recently received a grant, along with another journal editor Todd Sandler, from the US Department of Homeland Security for a project on ‘Terrorist Spectaculars: Backlash Attacks and the Focus of Intelligence.’

One of the outputs from this US government grant was a paper published in Defence and Peace Economics applying game theory — which is really only as accurate as its assumptions — to domestic counter-terrorism policy. As political scientist Dr. Brandon Tozzo of Trent University observes, game theory has become part and parcel of the government-business co-optation of academic international relations to produce pseudoscientific ‘scholarship’ designed to support expansionist American foreign policies.

Unsurprisingly, then, Arce arrived at the incoherent Orwellian conclusion that “counterterrorism policies have the potential to generate positive public support for terrorism via a backlash that may fuel terror recruitment.” Due to this potential for backlash, Arce recommended that governments targeted by terrorism continue such policies but also “fight an information war to change public opinion regarding its own policies and the ultimate effect of terror attacks.”

Prof. Arce is no longer chief editor at Defence and Peace Economics although he remains on the editorial board.

The journal’s founding editor and special editorial advisor is Prof. Keith Hartley of the University of York, who is also a longtime British Ministry of Defence (MoD) consultant. From 1985 to 2001, Hartley was special advisor to the House of Commons Defence Committee. He currently chairs the finance group of the UK Department of Trade and Industry’s Aerospace Innovation and Growth Team.

During the period after the 2003 Iraq War, Hartley was on the Project Board of the Defence Analytical Services Agency’s (DASA) Quality Review of Defence Finance & Economic Statistics. DASA provides statistical services to the MoD, and its chief executive is also MoD’s head of statistics.

The MoD privately recognized the 2006 Lancet study’s design as “robust” and its methods “close to best practice,” but officially the British government rejected its findings. In an email exchange between Foreign Office officials obtained by the BBC, one asked: “Are we really sure the report is likely to be right? That is certainly what the brief implies.” The other official replied: “We do not accept the figures quoted in the Lancet survey as accurate,” despite the “survey methodology” being “a tried and tested way of measuring mortality in conflict zones.”

The British government’s official reason, though, as explained in a public statement, was that “… the Lancet figures are much higher than statistics from other sources” — like IBC, for instance.

Spagat’s paper did its best to do what MoD officials at the time could not: rubbish Lancet 2006's design and methodology.

Career MoD consultant Prof. Hartley believed strongly that the costs of the Iraq War outweighed its benefits. But he was also biased toward a heavily reduced casualty figure derived from the Iraq Body Count.

In 2005 presentation slides on the costs of the war, Hartley cited only two estimates of the Iraqi civilian death toll: 12,800–14,000 and 10,000–33,000 civilians. The latter figure was published in the Journal of the Royal United Services Institute (RUSI), a Whitehall think-tank that operates in close alignment with MoD policy.

But Hartley’s first figure was reported by IBC as of 22nd September 2004. Hartley even rounded down IBC’s figure of 14,843 recorded deaths to 14,000.

Hartley selected these low figures for Iraqi deaths despite the fact that The Lancet had the preceding year published the first major peer-reviewed epidemiological study estimating the total ‘excess death’ toll from the war, conservatively, at around 100,000 people.

Finally, in the acknowledgements to his Defence and Peace Economics paper, among the people Spagat thanks is Colin Kahl, then Obama’s deputy assistant secretary of defense for the Middle East and incumbent National Security Adviser to Vice President Joe Biden. This indicates either that the paper was reviewed at the highest levels of the Pentagon, or that Spagat consulted at that level, before submission and publication of his Lancet critique.

If ever there was a journal whose editors and peer-review network would lean ideologically toward publishing a fraudulent paper critical of The Lancet’s 2006 estimate of 655,000 excess Iraqi deaths due to the war, it was Defence and Peace Economics.

Conclusions

Spagat demands “a formal investigation of the second Lancet survey of mortality in Iraq,” but in reality the evidence gathered here shows the urgency of a formal investigation into the history, scholarship, interests and connections of IBC, its affiliated organisations and researchers, and its capacity to garner legitimacy by publishing in scientific journals.

Rather than being the product of genuine, independent academic inquiry, this investigation confirms that the IBC’s output, often with support from leading academic institutions, has largely been performed under the financial and organizational influence of the very same powerful vested interests that have fostered armed conflict in Iraq, Afghanistan, Colombia and beyond.

Many of the papers published in peer-reviewed journals by IBC-affiliated researchers should be investigated at the very least due to their unstated conflicts of interests, but also to determine the integrity of their cited data and associated sweeping statistical claims.

In particular, questions must be asked as to how and why elements of the scientific community have irresponsibly allowed statistically-fraudulent claims about conflict trends derived from convenience samples to be published in serious journals.

There is a glimpse of an answer to such questions, and part of it is to do with the present limitations of modern scientific inquiry — increasing disciplinary specialization, and conversely, decreasing capacity for interdisciplinary integration of knowledge.

Conflict dynamics are complex because they are not driven by one factor, but many. Number-crunching a selective database of media reports of violence without having any significant knowledge of some dimensions of the sociology, history, economics, environment, politics, ideology, and culture of a conflict cannot create meaningful knowledge about that conflict.

Yet it is not just the positivist fetishization of numbers that accounts for why journal editors might assume the IBC’s endless number-crunching to be evidence of real scientific inquiry. It is also a fundamental disparity in power between Western institutions of knowledge, and the regions where such conflicts unfold. Journal editors often know little about these conflicts. What they know largely comes from the very same media that IBC relies on to create its inherently selective database of violent deaths.

It is only such fundamental ignorance that permits truly absurd analyses of these conflicts to be seen as offering scientific insights. The façade of number-crunching, complete with graphs and diagrams, allows us to feel that we are analysing these conflicts scientifically and understanding their dynamics. The findings vindicate pre-established policies of repression and corruption, while permitting us to feel positive about our valuable contribution to science and peace.

In reality, as such number-crunching is only as good as its real-world assumptions, the occlusion of that ‘real-world’ from the equation means we are merely developing increasingly sophisticated numerical narratives that permit us to continue business-as-usual.

The IBC’s work of monitoring as many casualties as it can identify via open sources is indisputably an important and valuable endeavour. That value, however, has been diminished by attempts to abuse this work by subordinating it to the interests of warmongers, employing deceptive statistical techniques, making unsustainable exaggerated claims, and pursuing witch-hunt-like attacks on standard epidemiological research techniques.

The abuse of science to legitimize war and sanitize death is not unprecedented. More than half a century ago, the Nazis did so with much success. Yet they did not do so in isolation. Often, the work of Nazi scientists found disturbing parallels with wider scientific trends at the time, particularly in the form of racist biological and evolutionary theories bound up with eugenics.

This investigation shows that the subversion of science by powerful interests to legitimize violence continues.

The best epidemiological surveys suggest there have been around a million Iraqi deaths to date, due to the 2003 war. The campaign to discredit these findings in the name of science and peace has been co-opted from behind-the-scenes by the very institutions of power complicit in those deaths.

**This article was amended and updated on 5th May and 3rd June 2015 to account for a response from Dr. Patrick Ball regarding USIP’s role in funding IBC researchers; to account for a response in the Washington Post by statistician Dr. Andrew Gelman in which he stated that he felt his views about Spagat’s criticisms of the 2006 Lancet study had not been made sufficiently clear; and to correct an error. The first version of this article incorrectly stated that Ball’s HRDAG is based in Los Angeles. This has been corrected to San Francisco. The 5th May update to the article erroneously referred to USIP’s selection of IBC for funding. On 3rd June 2015, this was corrected to acknowledge that it was an IBC affiliated organisation (Every Casualty)**