Submitted on May 2, 2011

** Also posted here on “Valerie Strauss’ Answer Sheet” in the Washington Post

Most people involved in education policy know exactly what you mean when you refer to “the CREDO study.” I can’t prove this, but I suspect it may be the most frequently mentioned research report of the past two years (it was released in 2009).

For those who haven’t heard of it (or have forgotten), this report, done by the Center for Research on Education Outcomes (CREDO), which is based at Stanford University, was a comparison of charter schools and regular public schools in 15 states and the District of Columbia. Put simply, the researchers matched up real charter school students with fictional amalgamations of statistically similar students in the same area (the CREDO team called them “virtual twins”), and compared charter school students’ performance (in terms of test score gains) to that of their “twins.” The “take home” finding – the one that everybody talks about – was that, among the charter schools included in the analysis, 17 percent had students who did better on the whole than their public school twins, 37 percent had students who did worse, and in 46 percent there was no statistical difference. Results varied a bit by student subgroup and over time.

There are good reasons why this analysis is mentioned so often. For one thing, it remains the largest study of charter school performance to date, and despite some criticism that the “matching” technique biased charter effects downward, it was also a well-done large-scale study (for a few other good multi-state charter studies, see here, here, and here). Nevertheless, as is so often the case, the manner in which its findings are discussed and understood sometimes reflects a few key errors of interpretation. Given that it still gets attention in major media outlets, as well as the fact that the CREDO team continues to release new state-specific reports (the latest one is from Pennsylvania), it makes sense to quickly clear up three of the most common misinterpretations.

The first issue is one I have discussed before, and will reiterate here: The fact that students in a given charter school or group of charters performed “significantly better” or “significantly worse” than their counterparts in regular public schools does not mean that these differences were large, or even moderately large. If a charter school performed “significantly” better or worse than its virtual regular public school “twin,” this only means that the difference was highly unlikely to be zero – i.e., it cannot be chalked up to random fluctuation.

In fact, it’s a safe bet that the vast majority of the “significant” differences between charters and regular public schools were actually rather tiny. Many (if not most) of them were probably so small as to be meaningless. So, when you hear someone say that 17 percent of charters are excellent (as did the narrator in Waiting for Superman), or that 37 percent of them are awful, keep in mind that this is a massive misinterpretation of the findings. I often wonder how different our education debate (and that in other areas, as well) would be if we used the term “statistically discernible” instead of “statistically significant.”
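To see why “significant” need not mean “large,” here is a minimal sketch in Python. It is not the CREDO team’s method (their estimates come from regression models on matched students). It is a plain two-sample z-test, and both the effect size of 0.03 standard deviations and the group sizes are hypothetical values chosen only for illustration.

```python
import math


def two_sided_p_value(effect_sd: float, n_per_group: int) -> float:
    """P-value of a two-sample z-test for a mean difference.

    effect_sd is the difference in means expressed in standard deviations;
    both groups are assumed to have n_per_group students and equal variance.
    """
    standard_error = math.sqrt(2.0 / n_per_group)
    z = abs(effect_sd) / standard_error
    # erfc gives the two-sided tail probability of a standard normal
    # without losing precision when z is large.
    return math.erfc(z / math.sqrt(2.0))


HYPOTHETICAL_EFFECT = 0.03  # a tiny difference, in standard deviations

for n in (100, 1_000, 10_000, 100_000):
    p = two_sided_p_value(HYPOTHETICAL_EFFECT, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n per group = {n:>7,}: p = {p:.4g} ({verdict} at the 5% level)")
```

The difference never changes, only the number of students does. With 100 students per group the same tiny gap is nowhere near significant; with 100,000 it is overwhelmingly so. Significance tells you the difference is discernible, not that it matters.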

The second clarification I’d like to offer is highly related: Even if there was a difference between a given charter school and its “virtual twin” that is both statistically significant and large – whether positive or negative – it does not necessarily follow that the school is “good” or “bad.” The CREDO report assesses schools with regression models that attempt to isolate the difference in performance between charters and regular public schools, while controlling for other potentially relevant factors (e.g., student characteristics). Any estimated difference between the two types of schools is, of course, a relative – not an absolute – performance measure.

So, for example, let’s say the students in “Charter School X” did much better than their “virtual twins” in regular public schools – the difference was huge and significant. This doesn’t mean that Charter X is a great school. Actually, Charter X might be a low-performing school, which just happened to do better than the schools to which it was compared, which were even worse. Conversely, Charter X and its comparison regular publics might all be wonderful schools, but Charter X just happened to be the best of the very good.
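The Charter X scenario can be put in numbers with a small sketch. Everything here is hypothetical: the gain scores, the benchmark, and the names are made up for illustration and do not come from the CREDO report.

```python
from dataclasses import dataclass

# All numbers below are hypothetical, chosen only to illustrate the point.
BENCHMARK_GAIN = 10.0  # an assumed absolute standard for an adequate yearly gain


@dataclass
class MatchedComparison:
    label: str
    charter_gain: float  # average test score gain of the charter's students
    twin_gain: float     # average gain of their "virtual twins"

    def relative_effect(self) -> float:
        """What a matched comparison estimates: charter minus twins."""
        return self.charter_gain - self.twin_gain

    def charter_meets_benchmark(self) -> bool:
        """What the matched comparison does NOT tell you."""
        return self.charter_gain >= BENCHMARK_GAIN


comparisons = [
    MatchedComparison("Charter X, weak comparison schools", 6.0, 2.0),
    MatchedComparison("Charter X, strong comparison schools", 18.0, 14.0),
]

for c in comparisons:
    print(f"{c.label}: relative effect = {c.relative_effect():+.1f}, "
          f"meets benchmark = {c.charter_meets_benchmark()}")
```

Both versions of Charter X show the identical relative effect, yet only one clears the assumed absolute bar. The relative estimate alone cannot tell the two apart.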

Just as the CREDO report – or at least the results it provides – does not permit any blanket statements about how much better the higher-performing regular public schools are than charters (or vice-versa), it also cannot be used to make absolute statements about “good” or “bad” schools (especially since so many of the differences are probably quite small). It’s true that, across a large group of charters (e.g., a whole state or a bunch of states), one would expect that these differences would be “smoothed out” a bit – i.e., there would be enough schools in the sample, both good and bad, to reflect some kind of average level of school “quality.” But any generalizations would have to be cautious – the results are all relative, and it’s not quite appropriate to use them in this fashion.

Remember that the next time you hear somebody cite CREDO and say that some charters did a fantastic job of boosting the scores of low-income students (the report breaks down the effects by subgroups), or that 37 percent of public schools are great schools. Neither is necessarily true. At least according to the results of the analysis, they did a better job than the schools to which they were matched, but whether or not this was a great job by some absolute standard is a separate question (albeit one that is difficult to assess).

The third and final issue is less of a clarification than a comment, and it pertains to the conclusions drawn from the report by both supporters and opponents of charters.

As mentioned above, the CREDO team’s “featured presentation” of their results consisted of what percentage of charters did better, did worse, and were not statistically different from regular public schools. In a fascinating reaction, both supporters and opponents of charter schools seized on the results. Supporters noted that 17 percent of charters did better, which they said meant we had to close the bad schools and replicate the good charters. Opponents noted that, while only 17 percent of charters did better, 83 percent did no better or worse, which they said meant that the charter experiment was a failure and should be ended. As a result, somewhat strangely, the largest charter study ever only served to deepen divisions between the two camps.

I don’t have any issue with the CREDO team’s presentation of results – it is, I suppose, easy to understand and it illustrates the performance diversity among schools. Still, the “standard” presentation in this type of study – one that compares one type of policy with another (in this case, charter and regular public school governance) – would be to present the overall effect (which was included in the report, the technical report, and offered in the first paragraph of the press release, but still got scant attention in mainstream information outlets).

For the record, the overall effect of charters was negative and statistically significant in both reading and math. The effect sizes, however, were trivial (.01 and .03 standard deviations, respectively). The proper interpretation on the whole is that there is very little difference between charter schools and regular public schools, at least in terms of their effect on student test scores (though aggregate estimates almost always mask underlying variation by subgroups – for instance, charters had a small positive effect among low-income and ELL students).
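To get a feel for how small those effect sizes are, here is a rough sketch that translates them into percentile terms. It assumes test scores are normally distributed, which is my simplification and not something the report states, and the function name is just illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def percentile_after_shift(start_percentile: float, effect_sd: float) -> float:
    """Percentile a student would land at after a shift of effect_sd
    standard deviations, assuming normally distributed scores."""
    z = STANDARD_NORMAL.inv_cdf(start_percentile / 100.0)
    return 100.0 * STANDARD_NORMAL.cdf(z + effect_sd)


# Overall charter effects cited above, in standard deviations (both negative).
overall_effects = {"reading": -0.01, "math": -0.03}

for subject, effect in overall_effects.items():
    new_percentile = percentile_after_shift(50.0, effect)
    print(f"{subject}: a median student moves from the 50th "
          f"to about the {new_percentile:.1f}th percentile")
```

Under that assumption, a student at the 50th percentile would end up at roughly the 49.6th percentile in reading and the 48.8th in math. Statistically discernible, yes. Educationally meaningful, hardly.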

The only thing everyone agrees on is that charter school performance – relative to comparable regular public schools – varies widely. Of course it does. So does the effectiveness of charters compared with other charters, and regular public schools compared with other public schools. The performance of schools – at least as measured by their students’ test score gains – varies.

So, my point here is: The CREDO report does not support the expansion or selective closing of charter schools, any more than it supports converting them back into regular public schools. There are good and bad schools of both types. This is an unsatisfying conclusion – after all, how could such a huge study provide no guidance as to how we should proceed on the charter school front?

I would suggest, once again, that we aren’t getting good answers because we’re not asking the right questions. It’s not whether some charters seem to do better or worse, but rather why (CREDO does provide some state-level results addressing this – they’re worth checking out, but very limited).

That is, how can we explain the performance of any “good” or “bad” school and, in so doing, hopefully identify specific policies and practices that can be used to improve all schools? Until we start focusing on that question, the charter school “debate” will continue to be trench warfare, in which even huge, well-done studies settle nothing.