U.S. student achievement looks more favorable on the global stage when comparisons take into account the especially large share of American adolescents who come from disadvantaged social backgrounds, concludes a study released today by the Stanford Graduate School of Education and the Economic Policy Institute. The gap, for instance, between U.S. students and those from top-scoring nations on one prominent global assessment would be cut in half in reading and by at least one-third in math, the study says, if statistical adjustments were made for social class.

In addition, the study finds that while the achievement of disadvantaged U.S. students has been "rising rapidly over time," test scores for such students in some nations to which the United States is frequently compared, such as Finland and South Korea, have been "falling rapidly."

The research, which draws on reading and math results spanning a decade or more on two high-profile international assessments, seeks to go beyond the average national test scores widely discussed and debated to better gauge how countries are educating particular groups of students, especially those who tend to face the biggest academic challenges.

"Education reformers frequently invoke the relatively poor performance of U.S. students to justify school policy changes," write co-authors Richard Rothstein of the Economic Policy Institute, a Washington think tank, and Martin Carnoy, an education professor at Stanford. But those conclusions draw on comparisons that are "oversimplified, frequently exaggerated, and misleading. They ignore the complexity of test results and may lead policymakers to pursue inappropriate and even harmful reforms."

Central to the new research is the premise that, in every country, students at the bottom of what the researchers call the "social-class distribution" perform worse, on average, than students higher in that distribution. And so, the U.S. average is brought down relative to some other nations with which the United States is frequently compared "because we have so many more test-takers from the bottom of the social-class distribution."

The study focuses on achievement in the United States and six other countries. They include three high-fliers on global comparisons (Canada, Finland, and South Korea) as well as three "similar post-industrial countries": France, Germany, and the United Kingdom.

The study follows a fresh batch of international achievement data, issued in December, on TIMSS, the Trends in International Mathematics and Science Study, and PIRLS, the Progress in International Reading Literacy Study. Here's the EdWeek coverage on the new global data.

The authors say that "social-class inequality" is greater in the United States than in "any of the countries with which we can reasonably be compared." As a result, the relative performance of U.S. adolescents is "better than it appears when countries' national average performance is conventionally compared."

Before I proceed, I want to briefly explain how the authors examined social status. They did not draw on family income, race/ethnicity, or parent education level to distinguish social-class groups. Instead, they relied on the number of books in adolescents' homes. "We consider that children in different countries have similar social-class backgrounds if their homes have similar numbers of books," they explain. Although the authors concede that the measure may be imperfect, they contend that this "indicator of household literacy is plausibly relevant to student academic performance, and it has been used frequently for this purpose by social scientists." They ultimately divided the population into six social-class groups, from least to most advantaged.

"This is the first time I think anybody has done a cross-country comparison with social-class disaggregation," Rothstein told me.

On PISA, the Program for International Student Assessment, the analysis found the picture for U.S. 15-year-olds improved considerably when social class is taken into account. As mentioned above, the gap with top-scoring nations, in this case Canada, Finland, and South Korea, would be cut in half in reading and by at least one-third in math. However, the adjustment does not change the global picture altogether. At all points in the social-class distribution, U.S. students perform worse, and in many cases substantially worse, than students in Canada, Finland, and South Korea, the study says. (You can find EdWeek's overview of the 2009 findings here.)

"Although controlling for social-class distribution would narrow the difference in average scores between these countries and the United States, it would not eliminate it," the study says.

A table in the study illustrates the differences in social-class distribution across the seven countries on PISA in 2009. It finds that 20 percent of U.S. students were in the lowest social-class group, a larger share than in any of the other six countries. Here are the data for the rest: Canada (9 percent), Finland (6 percent), South Korea (5 percent), France (15 percent), Germany (12 percent), and the United Kingdom (14 percent).
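For readers curious about how an adjustment for social class can work arithmetically, here is a minimal sketch of one common approach, often called direct standardization: keep each social-class group's average score fixed, but weight the groups by a reference distribution instead of the country's own. This is not necessarily the authors' exact procedure, and every number below is hypothetical, invented purely for illustration. None of the scores or shares comes from the study or from PISA.

```python
def weighted_mean(group_means, shares):
    """Average score across groups, weighting each group by its population share."""
    total_share = sum(shares)
    return sum(m * s for m, s in zip(group_means, shares)) / total_share


# HYPOTHETICAL inputs, not taken from the study or from PISA.
# Six social-class groups, ordered from least to most advantaged.
country_means = [440, 470, 495, 520, 540, 555]            # made-up average score per group
country_shares = [0.20, 0.20, 0.25, 0.15, 0.12, 0.08]     # made-up own distribution
reference_shares = [0.08, 0.15, 0.27, 0.20, 0.18, 0.12]   # made-up comparison distribution

# Conventional national average: groups weighted by the country's own shares.
raw_average = weighted_mean(country_means, country_shares)

# Adjusted average: same group scores, reweighted to the reference distribution.
adjusted_average = weighted_mean(country_means, reference_shares)

print(f"raw average:      {raw_average:.2f}")
print(f"adjusted average: {adjusted_average:.2f}")
```

Because the hypothetical country has a larger share of students in the lowest groups than the reference distribution does, reweighting raises its average even though no group's score changes. That is the basic mechanism behind a narrower, but not necessarily closed, gap.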

The authors also identify what they call an "apparent flaw" in the 2009 U.S. PISA sampling methodology that "probably reduced the reported average score of students in the bottom social class." Rothstein told me that officials from the Organization for Economic Cooperation and Development, which oversees PISA, were consulted about this situation and did not dispute it. As Rothstein explained to me, 40 percent of the U.S. sample for PISA in 2009 was drawn from schools in which at least half of all students are eligible for a free or reduced-price lunch. In other words, the U.S. sample included what the report calls a "disproportionate number of disadvantaged students who were enrolled in schools with unusually large concentrations of such students." (That said, the U.S. sample did include disadvantaged students in "appropriate proportion to their actual representation.")

[UPDATE: (3:45 p.m.) Since I published this blog post, I heard from an official at the National Center for Education Statistics, who said there was no sampling error as described above on PISA, and who also disputed how Rothstein characterized the reaction from OECD to this matter. OECD officials "refuted Rothstein's mistake very strongly," said Daniel McGrath, the director of the international-activities program at the National Center for Education Statistics, in an email. McGrath said Rothstein and Carnoy did not use the right data sources in making their claim about the extent to which students in high-poverty public schools were tested. "They had the wrong year and they compared mismatched data sources," McGrath writes. "The PISA sample did not have a disproportionate number of high-poverty schools." McGrath said there are "myriad other problems with this paper," which are addressed in a six-page letter the OECD sent to the authors. Since McGrath contacted me, I also heard back from Rothstein, who shared this response to the OECD letter.]

The researchers paid special attention to educational achievement in Finland, a nation that has been heralded as a global leader in schooling for its strong performance on PISA. (As I recently discovered, however, Finland's results are not as strong on TIMSS in math. In fact, in that subject, the national average is not statistically different from the United States'.)

Although the study finds that math and reading scores on PISA are higher in Finland than in the United States for every social-class group, Finland's scores have been falling for the most disadvantaged students while U.S. scores have been improving for students from a similar social background.

"This should lead to greater caution in applying presumed lessons from Finland," the study says.

There is a ton of data and analysis to mine in this new report, and my blog post only scratches the surface. I'll close by returning to the issue of methodology. In the end, the authors concede that some dimensions of their research strategy, with its focus on disadvantaged students as judged by their access to books at home, may be debated.

"Scholars and policymakers may choose different approaches," they write. "We are only certain of this: To make judgments only on the basis of statistically significant differences in national average scores, on only one test, at only one point in time, without regard to social-class context or curricular- or population-sampling methodologies, is the worst possible choice. But, unfortunately, this is how most policymakers and analysts approach the field."