"The readability of scientific texts is decreasing over time", according to a new paper just out. Swedish researchers Pontus Plaven-Sigray and colleagues say that scientists today use longer and more complex words than those of the past, making their writing harder to read. But what does it mean? Here's the key result. This image shows text readability metrics from 709,577 abstracts, drawn from 123 biomedical journals, published in English between 1881 and 2015.

There's been a clear decrease in Flesch Reading Ease scores, along with increases in New Dale-Chall difficulty scores, both of which indicate declines in readability. These metrics are often used to estimate the 'reading level' of a text, and Plaven-Sigray et al. say that "more than a fifth of scientific abstracts now have a readability considered beyond college graduate level English."

What's driving the change? Both metrics combine the average sentence length with a measure of word difficulty: the average word length in syllables (Flesch) or word 'commonness' (Dale-Chall). Plaven-Sigray et al. found that the decreasing readability of abstracts was mostly due to changes in word use, although sentence length has been increasing slightly since 1960.

In particular, Plaven-Sigray et al. point to increases in the use of what they call "general scientific jargon" or "science-ese", which they define as "vocabulary which is almost exclusively used by scientists and less readable in general." 'Science-ese', they say, includes words like "moreover", "underlying", "robust", and "suggesting". While not scientific terms per se, these are rarely used outside scholarly discourse today. Plaven-Sigray et al. show that use of these words has increased over time and has contributed to the decrease in readability. The full list of 'general scientific jargon' can be found in a supplementary file here. Overall, the authors conclude that:

We have shown a steady decrease of readability over time in the scientific literature... Lower readability implies less accessibility, particularly for non-specialists, such as journalists, policy-makers and the wider public... decreasing readability cannot be a positive development for efforts to accurately communicate science to non-specialists.

On the other hand, non-specialists in the past would have struggled to even find a copy of a biomedical abstract, so I'm not sure they would have been able to benefit from its readability. Overall, I'm not surprised by these results. In fact, they match nicely with my own analysis of word use in abstracts, in which I noted that there's been a shift in the words scientists use to describe their findings. Whereas 100 years ago terms like "notes" and "observations" were preferred, there has since been a gradual shift towards more formal, specialized terms like "data" and "results". I speculated about what this means:

The rise of "data" seems to reflect a reversal in the relationship between science and the rest of the world. My impression is that "data" is being used more and more widely in normal discourse but this is a borrowing, so to speak, from science, whereas previously, science was borrowing from everyday life.

Finally, Plaven-Sigray et al. use the acknowledgments section of their paper to say that "This article has a FRE score of 49. The abstract has a FRE score of 40." This post, meanwhile, has a FRE of 55.
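
For anyone curious how an FRE score like the ones above comes about, here is a rough Python sketch. It uses the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). The function names and the syllable counter are my own assumptions for illustration: the counter is a crude vowel-group heuristic, not the method used in the paper, so the scores it gives will differ somewhat from those reported by dedicated readability tools.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting runs of vowels (a crude heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Assume a trailing 'e' is silent (as in "use"), except in "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    # Naive sentence split on ., ! and ? (abbreviations will confuse it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    plain = "The cat sat on the mat."
    jargon = ("Moreover, the underlying mechanisms remain robust, "
              "suggesting considerable heterogeneity.")
    print(round(flesch_reading_ease(plain), 1))
    print(round(flesch_reading_ease(jargon), 1))
```

Short sentences of short words score high, while a sentence packed with 'science-ese' scores far lower. The syllables-per-word term carries the largest coefficient, which fits the finding that word choice, more than sentence length, accounts for most of the decline.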