Recently Scott Greenfield over at Simple Justice sent me a tweet asking my thoughts on Dara Lind’s article on false rape allegations. After taking a quick read through, I was fairly dismissive of it. Much to her credit, upon seeing the Twitter exchange between myself and Scott, Dara immediately corrected the error in her piece. She also made herself available for any questions I had while writing this post, so you will see some of her responses included where appropriate. While I disagree with some of the conclusions she came to in her article, I was really impressed with the way she conducted herself, and she has certainly earned my respect.

So what is the main conclusion I disagree with?

For one thing, research has finally nailed down a consistent range for how many reports of rape are false: somewhere between 2 and 8 percent, which is a lot narrower than the 1.5 percent to 90 percent range of the past.

My first problem with this is that there aren’t any US studies that I am aware of that actually use the 2-8% range. The only place I’ve seen that range used is in The Voice article which, as I’ve previously discussed, isn’t exactly peer-reviewed research. Even Lisak, who is a listed author of The Voice article, says the range is wider at 2-10%. I asked Lind about this and here is her response:

Q2: In the article you state “For one thing, research has finally nailed down a consistent range for how many reports of rape are false: somewhere between 2 and 8 percent” and have a section heading of “A growing consensus: between 2 and 8 percent of allegations.” In your research did you find other authors coming up with that range besides Lonsway herself or when referencing the 2009 The Voice article?

A: To answer questions 2 and 5: I almost certainly relied too much on Lonsway, between the interview I conducted with her and her response to Lisak.

I also asked about how heavily she weighed the relative importance of the various studies she researched:

Q4: When arriving at your conclusions, how heavily did you weigh recency and location (or perhaps the better way to phrase it – how much credence did you give to studies done outside the US or more than 20 years ago)?

A: To answer question 4: Strong bias for recency, little if any bias for US-based (I was working off Rumney, after all) but some. I did try to make the paucity of recent US-based studies clear in the article.

Here is another area where I disagree with her – I don’t see why there would be some sort of universal rate of false reporting worldwide, so the international studies aren’t particularly meaningful to me. Once you strip out the international studies and the studies more than 30 years old, all we are really left with is the MAD study and the Lisak study. In previous posts I’ve detailed many of the problems with the MAD study, but here are some of the highlights:

The study was conducted by End Violence Against Women International, an organization that can hardly claim to be unbiased in regard to the prevalence of false rape reports

Prior to the study Joanne Archambault, the executive director of End Violence Against Women International, expressed her opinion that the real rate of false reporting was 4%

The communities studied were not a random sample, but rather had to apply for the study and were then chosen by a selection committee

Despite the data collection period being from 2005-2006, the study results have yet to be published in any peer-reviewed journal

Reports could only be classified as false after a “thorough, evidence-based investigation.” However, such an investigation isn’t really possible if you follow EVAW International’s training materials, which discourage asking too many questions for fear of receiving inconsistent responses and suggest stopping the investigation if things seem off

What about the Lisak study? Lind correctly describes it as “inescapably narrow,” though in my mind that isn’t the only problem with it. One of Lisak’s primary complaints about the Kanin study was that it “violates a cardinal rule of science, a rule designed to ensure that observations are not simply the reflection of the bias of the observer.” With that in mind, let’s take a look at the team of researchers Lisak selected to help him categorize reports (emphasis mine):

Lori Gardinier, MSW, PhD, is the program director for the Human Services Major at Northeastern University in Boston, Massachusetts where she is also the founder of the Campus Center on Violence Against Women. She holds a master’s degree in social work from Boston University and a PhD from Northeastern University. She has practiced in the area of antipoverty/ social justice work in community-based settings and as a counselor in organizations addressing intimate partner violence. Her most recent publication examines the Paid Family Leave Campaign in Massachusetts as a social movement.

Sarah C. Nicksa, MA, is a PhD candidate in the sociology program at Northeastern University. Her dissertation, entitled “Bystander Reactions to Witnessing Sexual Assault: The Impact of Gender, Community, and Social Learning,” will be completed in spring 2011. She regularly teaches “Violence in the Family” and is a medical advocate at the Boston Area Rape Crisis Center.

Ashley M. Cote is a graduate of Northeastern University’s College of Criminal Justice, where she focused on juvenile justice, security, and criminology. At Northeastern University, she was a member of the Campus Center on Violence Against Women, studied the effects of parental attachment on youth violence, and was elected a gubernatorial advisor for the Juvenile Justice Advisory Committee under the Massachusetts Executive Office of Public Safety and Security. She is currently employed at the Massachusetts General Hospital’s police, security, and outside services department and plans to earn a master’s in social work and urban leadership.

Essentially, a team of victim advocates, who are trained in the mantra of “Always Believe,” sorted reports into the categories they felt were appropriate. If Lisak was trying to ensure that his observations were not simply the reflection of the bias of the observer, I think he could have done a better job. For the sake of argument, though, let’s say that none of the above concerns had an impact on the studies. Even if that were the case, I would still disagree with the conclusion that the range of false reporting is between 2-8%. The real problem is what question these studies are actually answering. Here are just a few examples of what people think these studies say:

RE: RollingStone’s statement on UVA story. Best research shows 2-10% rape allegations false that means 90-98% real

Instead of saying 2-8% of rape reports are false to weigh in on the issue, why not say at least 92-98% are true??

She says “a fair #” of women lie about rape. No. B/w 2-8% of rape reports turn out to be false. Meaning out of 100 women, 92-98 are truthful

People think the question these studies answer is “What percentage of rape reports are false?” In reality, the question they are really answering is “What percentage of rape reports can we classify as false with a high degree of certainty?” As a result, these studies don’t give us binary outcomes. This isn’t necessarily a result of flawed study designs either. Statistics isn’t a magical art form capable of divining absolute truth from thin air. When it comes to sexual assault, unless you were physically able to witness what happened, it can be very difficult to classify a report as either true or false. In a previous post I detailed how the data in the MAD study classifies 7.1% of reports as false, but depending on which assumptions you use it could also classify only 1.2% – 7.8% as “true.” This means that in the vast majority of cases, we don’t really have a way of determining if they are true or false.
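To make the point concrete, here is a small arithmetic sketch using the MAD study shares discussed above (7.1% classified false, and taking the upper end of the 1.2%–7.8% “true” range). The point is simple bounds arithmetic, not a re-analysis of the study:

```python
# Shares from the MAD study as discussed above (percent of all reports).
classified_false = 7.1
classified_true = 7.8  # upper end of the 1.2%-7.8% "true" range

# Everything else could not be classified with high certainty.
unclassified = 100 - classified_false - classified_true  # ~85.1%

# The unclassified cases could in principle fall on either side, so the
# data only bounds the real false-report rate rather than pinning it down:
lower_bound = classified_false                 # if every unclassified case is true
upper_bound = classified_false + unclassified  # if every unclassified case is false
print(f"consistent with anywhere from {lower_bound:.1f}% to {upper_bound:.1f}%")
```

In other words, the same data is arithmetically consistent with a false-report rate anywhere from about 7% to about 92%; the study itself cannot choose between those extremes.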

Let’s take a look at the outcomes of the Lisak study next. First up, we have “False Reports” at 5.9%:

Applying IACP guidelines, a case was classified as a false report if there was evidence that a thorough investigation was pursued and that the investigation had yielded evidence that the reported sexual assault had in fact not occurred. A thorough investigation would involve, potentially, multiple interviews of the alleged perpetrator, the victim, and other witnesses, and where applicable, the collection of other forensic evidence (e.g., medical records, security camera records). For example, if key elements of a victim’s account of an assault were internally inconsistent and directly contradicted by multiple witnesses and if the victim then altered those key elements of his or her account, investigators might conclude that the report was false. That conclusion would have been based not on a single interview, or on intuitions about the credibility of the victim, but on a “preponderance” of evidence gathered over the course of a thorough investigation

That makes for a pretty high bar to clear. In how many reports does it seem likely you would be able to show 1) key elements of a victim’s account of an assault were internally inconsistent, 2) those elements were directly contradicted by multiple witnesses, and 3) the victim then altered those key elements of his or her account? At this point I’ll refer you back to the tweets I listed above. It is one thing to have a personal policy of assuming a report is true unless there is conclusive proof to the contrary, but these individuals are effectively claiming that every report in all of the remaining categories below must be true.

Next up is “Case did not proceed” which was the most used classification at 44.9%:

This classification was applied if the report of a sexual assault did not result in a referral for prosecution or disciplinary action because of insufficient evidence or because the victim withdrew from the process or was unable to identify the perpetrator or because the victim mislabeled the incident (e.g., gave a truthful account of the incident, but the incident did not meet the legal elements of the crime of sexual assault).

In other words, the most frequent classification in the study cannot be slotted into the true/false binary.

Next is “Insufficient information to assign a category” at 13.9%:

This classification was applied if a report lacked basic information (e.g., neither the victim nor the perpetrator was identified, and there was insufficient information to assign a category).

Here is a fun math trick. If you would like the headline number of your study to be smaller, be sure to fluff your denominator with cases that lack enough basic information to properly categorize.
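To show how much this matters, here is a quick sketch using the study’s own category shares (5.9% false, 13.9% insufficient information). Nothing here is from the study itself beyond those two numbers; it just shows the denominator effect:

```python
# Category shares from the Lisak study quoted above (percent of all reports).
false_reports = 5.9
insufficient_info = 13.9  # "insufficient information to assign a category"

# Headline rate keeps the uncategorizable cases in the denominator.
headline_rate = false_reports  # 5.9% of ALL reports

# Dropping the uncategorizable cases from the denominator nudges the rate up.
adjusted_rate = false_reports / (100 - insufficient_info) * 100
print(f"{headline_rate:.1f}% vs {adjusted_rate:.1f}%")  # ~5.9% vs ~6.9%
```

A full percentage point may sound small, but when the headline range is only 2-8%, the choice of denominator moves the number a meaningful amount.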

Finally, we have “Case proceeded” at 35.3%:

This classification was applied if, after an investigation, the report resulted in a referral for prosecution or disciplinary action or some other administrative action by the university (e.g., the victim elected not to pursue university sanctions, but the alleged perpetrator was barred from a particular building).

It may be tempting to view this category as “true,” but compare these criteria to the extremely strict definition of a false report. Let’s flip the scenario. If, instead of a team of victim advocates, the researchers had been defense attorneys or due process advocates who applied a similarly strict set of criteria to the “case proceeded” bucket, I’d have to imagine the “true report” category would be quite small as well. If someone did such a study and used it to claim that 90%+ of rape reports were false, it would be regarded as hogwash – and rightly so.

Lind’s point of view was that “research has finally nailed down a consistent range for how many reports of rape are false” and “The question is whether this research is going to get acknowledged, or if false accusations are going to continue to be treated as an unknowable X-factor in rape cases.” I would argue that the real question is, if the only research we have isn’t able to classify the majority of cases as either true or false, how can we possibly claim to know the actual frequency of either?

The issue of the 2-8% range was the area I primarily disagreed with, but there were two other things I wanted to touch on quickly.

First, in her article Lind discusses what the research shows about why people file false reports. My issue here is that once again we have to be careful about what we think we know versus what the research is actually able to tell us. Since we are only able to examine the motivations for false reporting in cases we can determine to be false, instead of answering “Why do people file false rape reports?” these studies may only be answering “Why do the reports we are able to classify as false get filed?” Lind claims that “Revenge wasn’t a very common motivation. And regret or guilt — the motivation the “gray rape” narrative implies is most common — wasn’t much of a factor at all.” This may well turn out to be the case; however, we have to consider the possibility that reports filed because of revenge or guilt may just be harder to classify as false, and thus the available research might not be capturing them.

Second, in my tweets to Scott, I questioned whether or not Lind had read the relevant underlying studies. In my email to her I explained my rationale for that. Here are the relevant portions of my email and her response:

1 – Basis for my tweets

The first thing that made me think you hadn’t fully read the source material was obviously the mixing up of the MAD study with the Lisak study. However, it was primarily your treatment of the Kanin study that ultimately led me to that hypothesis. There were three specific places in your article that made it seem more like your exposure to the Kanin study was through Lisak’s critiques rather than reading the study in its entirety.

First, when you bring up the study you quickly dismiss it because “But the department asked anyone claiming to have been raped to take a polygraph test to prove it.” For starters, I’m not sure that is 100% consistent with how it is described in the methodology section: “The investigation of all rape complaints always involves a serious offer to polygraph the complainants and the suspects.” That is just a minor quibble though. The more important issue for me is that Kanin specifically addressed the concern about false confessions: “Several responses are possible to this type of criticism. First, with very few exceptions, these complainants were suspect at the time of the complaint or within a day or two after charging. These recantations did not follow prolonged periods of investigation and interrogation that would constitute anything approximating a second assault. Second, not one of the detectives believed that an incident of false recantation had occurred. They argued, rather convincingly, that in those cases where a suspect was identified and interrogated, the facts of the recantation dovetailed with the suspect’s own defense. Last, the policy of this police agency is to apply a statute regarding the false reporting of a felony. After the recant, the complainant is informed that she will be charged with filing a false complaint, punishable by a substantial fine and a jail sentence. In no case, has an effort been made on the part of the complainant to retract the recantation. Although we certainly do not deny the possibility of false recantations, no evidence supports such an interpretation for these cases.” I cover this in more detail in my posts on the topic, but the fact that no one claimed they just confessed to get the process over with after being told that they would be charged with filing a false complaint seems to be a pretty key element here.

Second, if your primary issue with the study is the polygraph, why no mention of the additional data in the addenda that found even higher false reporting rates at colleges and did not involve the use of the polygraph?

Finally, your discussion of the motivations for false reporting completely excludes the findings in the Kanin study. Specifically, your claim that “Revenge wasn’t a very common motivation” seems at odds with his finding it present in 27% of the cases in the original study, and 44% in the addenda. I should point out here that I think there are some serious issues in the Kanin study – just not the ones that Lisak brings up.

2 – Research questions

On the topic of reading all the underlying studies, you use a LOT of studies in your article. A quick count shows well over 1,300 pages and that doesn’t even get them all. [Note: in the original email I included a listing of page counts here]

Q1: I’m assuming you didn’t read all of those in their entirety, so which ones did you read, which were more skimming and which were more based on the abstracts?

A: One thing to note is that I actually drafted this article in December. Both the editor and I have had a terrible time getting edits back to each other. That has two implications: 1) neither post in your series on this was up when I researched and drafted the piece; 2) my memory of what exactly I tracked down is not as good as it would be with a more recently researched article. (Yes, this means that I ran a piece that could be/could have been outdated. Again, I own that.)

To answer the preliminary question and question 1: Your instinct is right but your hypothesis is wrong. I did start from a lit review and work backwards from there (and did not read every study I mentioned). The lit review I used, though, wasn’t Lisak – I skimmed that later – but Rumney 2006. I think that, as a general rule, it’s acceptable for journalists to use lit reviews in this fashion. And because I started with the academic lit review before talking to Lonsway, etc., I wasn’t as concerned with accounting for author bias — it’s not that I think that academics don’t have bias, but a) I think it’s acceptable to treat academic work as legitimate until shown otherwise and b) in the absence of other information, trying to account for academic bias leads to a lot of caveats along the lines of “but this might not be the case, we really don’t know.” Which become repetitive and get edited out. As for Kanin in particular: I know I tracked down the cite as hard as I could, precisely because I wanted to check Rumney’s characterization of the polygraph test. I can’t remember if I actually got to the text or not (we lack access to the major academic databases — which is another reason why leaning on lit reviews happens).

Dara was unable to answer any questions that dealt with the editing process, so there is no way to know how big a factor that might have been. Any work of professional journalism inevitably involves removals or changes due to length, flow, etc., which may help explain why the article neglected to mention certain elements I would like to have seen included. After reading through her article in more detail, I no longer think this was a matter of her not doing proper research, though we have differing opinions on what conclusions to draw from the underlying data.

To end this post on an optimistic note – our exchange does seem to be an example of a Twitter conversation leading to productive discussion instead of just devolving into a diarrheic stream of insults. It may not happen often, but it is good to know that it is possible.