Yesterday I looked into quantitatively measuring the rumor I’ve been hearing for years, namely that charter schools cherrypick students – get rid of troublesome ones, keep well-behaved ones, and so on.

Here are two pieces of anecdotal evidence. There was a “Got To Go” list of students at one charter school in the Success Academy network. These were troublesome kids that the school was pushing out.

Also, I recently learned that Success Academy doesn’t accept new kids after the fourth grade. Their reasoning is that older kids wouldn’t be able to catch up with the rest of the kids, but on the other hand it also means that kids kicked out of one school will never land there. This is another form of selection.

Now that I’ve given my two examples, I realize they both come from Success Academy. There really aren’t that many Success Academy schools, as you can see on this map, but they are a politically potent force in the charter school movement.

Also, to be clear, I am not against charter schools as a concept. I love the idea of experimentation, and to the extent that charter schools perform experiments that can inform how public schools run, that’s interesting and worthwhile.

Anyhoo, let’s get to the analysis. I got my data from this DOE website, down at the bottom, where I clicked “citywide results” and grabbed the following Excel file:

With that data, I built an iPython Notebook, which is on GitHub here, so you can take a look, reproduce my results with the above data (I removed the first line after turning it into a CSV file), or do more.

From talking to friends of mine who run NYC schools, I learned of two proxies for difficult students. One is ‘Percent Students with Disabilities’ and the other is ‘Percent English Language Learners’ (I also learned that charter schools’ DBN code starts with 84). Equipped with that information, I was able to build the following histograms:
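For anyone who wants to try the split without opening the notebook, here is a minimal sketch of the idea, not the notebook's actual code. The column names are my guess at what the DOE file calls them, and the sample rows are made up purely to show the shape of the data. The only rule taken from above is that a DBN starting with 84 marks a charter school.

```python
import csv
import io
from statistics import mean

# Assumed column names; the real DOE file may label them differently.
DBN_COL = "DBN"
SWD_COL = "Percent Students with Disabilities"
ELL_COL = "Percent English Language Learners"


def is_charter(dbn):
    """A DBN code starting with 84 marks a charter school (per the post)."""
    return dbn.strip().startswith("84")


def split_by_charter(rows, column):
    """Return (charter_values, non_charter_values) for one numeric column."""
    charter, other = [], []
    for row in rows:
        raw = (row.get(column) or "").strip().rstrip("%")
        try:
            value = float(raw)
        except ValueError:
            continue  # skip schools with a missing or non-numeric value
        if is_charter(row[DBN_COL]):
            charter.append(value)
        else:
            other.append(value)
    return charter, other


# Made-up rows for illustration only; these are not real DOE figures.
SAMPLE_CSV = """DBN,Percent Students with Disabilities,Percent English Language Learners
84K001,12.0,3.0
84M002,10.0,5.0
01M015,22.0,14.0
02M047,18.0,20.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
charter_swd, other_swd = split_by_charter(rows, SWD_COL)
charter_ell, other_ell = split_by_charter(rows, ELL_COL)

print("Mean % with disabilities (charter, non-charter):",
      mean(charter_swd), mean(other_swd))
print("Mean % English language learners (charter, non-charter):",
      mean(charter_ell), mean(other_ell))
```

To run it on the real data, replace `io.StringIO(SAMPLE_CSV)` with an open file handle on the CSV and adjust the column names to match. The two lists for each proxy are what you would feed to a histogram.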

I also computed statistics which you can look at on the iPython notebook. Finally, I put it all together with a single scatterplot:

The blue dots to the left and all the way down on the x-axis are mostly test schools and “screened” schools, which are actually constructed to cherrypick their students.

The main conclusion of this analysis is that, generally speaking, charter schools don’t have as many kids with disabilities or as many English language learners, and so when we compare their performance to non-charter schools, we need to somehow take this into account.

A final caveat: we can see just by looking at the above scatter plot that there are plenty of charter schools that are well inside the middle of the blue cloud. So this is not an indictment of any specific charter school, but rather a statistical statement about wanting to compare apples to apples.

Update: I’ve now added t-tests to test the hypothesis that the charter and non-charter data come from the same distribution (strictly speaking, that the two groups have the same mean). The answer is no.
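To make the t-test concrete, here is a rough sketch of one common variant, Welch's t-test, which does not assume the two groups have equal variance. I'm not claiming this is the exact variant in the notebook, and the numbers below are invented for illustration. The function returns the t statistic and the approximate degrees of freedom. Turning those into a p-value needs a t distribution, which a stats library such as SciPy provides.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented percentages for illustration only; not real DOE figures.
charter = [10.0, 12.0, 14.0]
other = [18.0, 20.0, 22.0, 24.0]

t, df = welch_t(charter, other)
print("t statistic:", t)
print("approximate degrees of freedom:", df)
```

A t statistic far from zero, relative to the degrees of freedom, is evidence against the two groups sharing a mean. With real data you would pass in the per-school percentages for charter and non-charter schools.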