This week marks the release of the 2015 Brown Center Report on American Education, the fourteenth issue of the series. One of the three studies in the report, “Girls, Boys, and Reading,” examines the gender gap in reading. Girls consistently outscore boys on reading assessments. They have for a long time. A 1942 study in Iowa discovered that girls were superior to boys on tests of reading comprehension, vocabulary, and basic language skills.[i] Girls have outscored boys on the National Assessment of Educational Progress (NAEP) reading assessments since the first NAEP was administered in 1971.

I hope you’ll read the full study—and the other studies in the report—but allow me to summarize the main findings of the gender gap study here.

Eight assessments generate valid estimates of U.S. national reading performance: the Main NAEP, given at three grades (fourth, eighth, and 12th grades); the NAEP Long Term Trend (NAEP-LTT), given at three ages (ages nine, 13, and 17); the Progress in International Reading Literacy Study (PIRLS), an international assessment given at fourth grade; and the Program for International Student Assessment (PISA), an international assessment given to 15-year-olds. Females outscore males on the most recent administration of all eight tests. And the gaps are statistically significant. Expressed in standard deviation units, they range from 0.13 on the NAEP-LTT at age nine to 0.34 on the PISA at age 15.

The gaps are shrinking. At age nine, the gap on the NAEP-LTT declined from 13 scale score points in 1971 to five points in 2012. During the same time period, the gap at age 13 shrank from 11 points to eight points, and at age 17, from 12 points to eight points. Over the full period, only the decline at age nine is statistically significant, but at ages 13 and 17 the declines since the gaps peaked in the 1990s are also statistically significant. At all three ages, gaps are shrinking because males are making larger gains on NAEP than females. In 2012, seventeen-year-old females scored the same on the NAEP reading test as they did in 1971. Otherwise, males and females of all ages registered gains on the NAEP reading test from 1971 to 2012, with males’ gains outpacing those of females.

The gap is worldwide. On the 2012 PISA, 15-year-old females outperformed males in all sixty-five participating countries. Surprisingly, Finland, a nation known for both equity and excellence because of its performance on PISA, evidenced the widest gap. Girls scored 556 and boys scored 494, producing an astonishing gap of 62 points (about 0.66 standard deviations—or more than one and a half years of schooling). Finland also had one of the world’s largest gender gaps on the 2000 PISA, and since then it has widened. Both girls’ and boys’ reading scores declined, but boys’ declined more (26 points vs. 16 points). To put the 2012 scores in perspective, consider that the OECD average on the reading test is 496. Finland’s strong showing on PISA is completely dependent on the superior performance of its young women.
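The arithmetic behind the Finland figures can be checked with a short sketch. It uses only the numbers quoted above. The standard deviation it prints is inferred from the reported 62-point gap and the reported 0.66 figure. It is not a value taken from the report, and the variable names are illustrative.

```python
# Scores quoted in the text for Finland on the 2012 PISA reading test.
girls_score = 556
boys_score = 494
oecd_average = 496

# Gender gap in scale-score points.
gap_points = girls_score - boys_score

# The text equates the gap with about 0.66 standard deviations.
# ASSUMPTION: dividing the two reported figures recovers the standard
# deviation that was used; this is an inference, not a reported value.
reported_gap_in_sd = 0.66
implied_sd = gap_points / reported_gap_in_sd

print(f"Gap: {gap_points} points")
print(f"Implied standard deviation: {implied_sd:.1f} points")
print(f"Girls vs. OECD average: {girls_score - oecd_average:+d} points")
print(f"Boys vs. OECD average: {boys_score - oecd_average:+d} points")
```

On these figures, Finnish girls sit 60 points above the OECD average while Finnish boys sit 2 points below it, which is the basis for saying that Finland's strong showing depends on its young women.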

The gap seems to disappear by adulthood. Tests of adult reading ability show no U.S. gender gap in reading by 25 years of age. Scores even tilt toward men in later years.

The words “seems to disappear” are used on purpose. One must be careful with cross-sectional data not to assume that differences across age groups indicate an age-based trend. A recent Gallup poll, for example, asked several different age groups how optimistic they were about finding jobs as adults. Optimism fell from 68% in grade five to 48% in grade 12. The authors concluded that “optimism about future job pursuits declines over time.” The data do not support that conclusion. The data were collected at a single point in time and cannot speak to what optimism may have been before or after that point. Perhaps today’s 12th-graders were even more pessimistic several years ago when they were in fifth grade. Perhaps the 12th-graders are old enough to remember when unemployment spiked during the Great Recession and the fifth-graders are not. Perhaps 12th-graders are simply savvier about job prospects and the pitfalls of seeking employment, topics on which fifth-graders are basically clueless.

At least with the data cited above we can track measures of the same cohorts’ gender gap in reading over time. By analyzing multiple cross-sections (data collected at several different points in time) we can look at real change. The cohorts of nine-year-olds in the 1970s, 1980s, and 1990s are today in their 50s, 40s, and 30s, respectively. Girls were better readers than boys when these cohorts were children, but as grown-ups, women are not appreciably better readers than men.

Care must nevertheless be taken in drawing firm conclusions. So-called cohort effects can bias measurements. I mentioned the Great Recession. Experiencing great historical cataclysms, especially war or economic chaos, may bias a particular cohort’s responses to survey questions or even its performance on tests. American generations who experienced the Great Depression, World War II, and the Vietnam War (and, more recently, the digital revolution, the Great Recession, and the Iraq War) lived through events that uniquely shape their outlook on many aspects of life.

What Should Be Done?

The gender gap is large, worldwide, and persistent through the K-12 years. What should be done about it? Maybe nothing. As just noted, the gap seems to dissipate by adulthood. Moreover, crafting an effective remedy for the gender gap is made more difficult because we don’t know its cause with any certainty. Enjoyment of reading is a good example. Many commentators argue that schools should make a concerted effort to get boys to enjoy reading more. Enjoyment of reading is statistically correlated with reading performance, and the hope is that making reading more enjoyable would get boys to read more, thereby raising reading skills.

It makes sense, but I’m skeptical. The fact that better readers enjoy reading more than poor readers—and that the relationship stands up even after boatloads of covariates are poured into a regression equation—is unpersuasive evidence of causality. As I stated earlier, PISA produces data collected at a single point in time. It isn’t designed to test causal theories. Reverse causality is a profound problem. Getting kids to enjoy reading more may in fact boost reading ability. But the causal relationship might be flowing in the opposite direction, with enhanced skill leading to enjoyment. The correlation could simply be indicating that people enjoy activities that they’re good at—a relationship that probably exists in sports, music, and many human endeavors, including reading.
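To make the reverse-causality point concrete, here is a minimal simulation sketch. It does not use PISA data. The two toy worlds, the 0.5 effect size, the sample size, and the names `pearson_r` and `simulate` are all assumptions chosen for illustration. In one world enjoyment drives skill, and in the other skill drives enjoyment, yet a single cross-section yields much the same correlation in both.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate(direction, n=5000, seed=0):
    """Return (enjoyment, skill) lists under one assumed causal direction."""
    if direction not in ("enjoyment_causes_skill", "skill_causes_enjoyment"):
        raise ValueError(f"unknown direction: {direction}")
    rng = random.Random(seed)
    enjoyment, skill = [], []
    for _ in range(n):
        cause = rng.gauss(0, 1)                 # the variable that comes first
        effect = 0.5 * cause + rng.gauss(0, 1)  # depends on the cause plus noise
        if direction == "enjoyment_causes_skill":
            enjoyment.append(cause)
            skill.append(effect)
        else:
            skill.append(cause)
            enjoyment.append(effect)
    return enjoyment, skill

r_forward = pearson_r(*simulate("enjoyment_causes_skill", seed=1))
r_reverse = pearson_r(*simulate("skill_causes_enjoyment", seed=2))
print(f"enjoyment -> skill: r = {r_forward:.2f}")
print(f"skill -> enjoyment: r = {r_reverse:.2f}")
```

Under these assumed parameters both correlations come out positive and close to each other (roughly 0.45 in expectation), so the size of the correlation alone cannot reveal which way the causal arrow points.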

Tom Loveless, Former Brookings Expert

A Key Policy Question

A key question for policymakers is whether boosting boys’ enjoyment of reading would help make boys better readers. I investigate by analyzing national changes in PISA reading scores from 2000, when the test was first given, to 2012. PISA creates an Index of Reading Enjoyment based on several responses to a student questionnaire. Enjoyment of reading has increased among males in some countries and decreased in others. Is there any relationship between changes in boys’ enjoyment and changes in PISA reading scores?

There is not. The correlation coefficient for the two phenomena is -0.01. Nations such as Germany raised boys’ enjoyment of reading and increased their reading scores by about 10 points on the PISA scale. France, on the other hand, also raised boys’ enjoyment of reading, but French males’ reading scores declined by 15 points. Ireland raised boys’ enjoyment of reading slightly, but boys’ scores fell a whopping 37 points. Poland’s males actually enjoyed reading less in 2012 than in 2000, but their scores went up more than 14 points. No relationship.
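For readers who want to see how such a cross-country correlation is computed, here is a minimal sketch. The score changes are the approximate figures quoted above. The enjoyment-index changes are hypothetical placeholders that match only the direction described in the text, because the actual index values are not given here. The function name `pearson_r` is likewise illustrative. With four countries and made-up enjoyment values, the result will not reproduce the -0.01 figure, which comes from the full set of countries.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

countries = ["Germany", "France", "Ireland", "Poland"]

# Changes in boys' PISA reading scores, 2000 to 2012, as quoted in the text
# (Germany is "about 10" and Poland is "more than 14", so these are rounded).
score_change = [10, -15, -37, 14]

# HYPOTHETICAL changes in boys' reading-enjoyment index. Only the signs
# (up, up, up a little, down) follow the text; the magnitudes are invented.
enjoyment_change = [0.20, 0.20, 0.05, -0.10]

r = pearson_r(enjoyment_change, score_change)
print(f"Illustrative correlation across {len(countries)} countries: r = {r:.2f}")
```

A coefficient near zero, like the -0.01 reported for the full sample, means that knowing how much a country's boys changed in enjoyment says essentially nothing about how their scores changed.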

Some Final Thoughts

How should policymakers proceed? Large, cross-sectional assessments are good for measuring academic performance at one point in time. They are useful for generating hypotheses based on observed relationships, but they are not designed to confirm or reject causality. To do that, randomized controlled trials (RCTs) should be conducted of programs purporting to boost reading enjoyment. Also, consider that it ultimately may not matter whether enjoying reading leads to more proficient readers. Enjoyment of reading may be an end worthy of attainment irrespective of its relationship to achievement. In that case, RCTs should carefully evaluate the impact of interventions on both enjoyment of reading and reading achievement, whether the two are related or not.