At ASAPbio, we’ve generally defined a preprint as an article that has not yet completed journal-organized peer review. But this definition is imperfect (as we’ll discuss below) and isn’t universally shared. For example, an authoritative resource on journal preprint policies, SHERPA-RoMEO, says in its FAQ, “Publishers may use the term pre-print to define all forms of the article prior to print publication. SHERPA follows an academic practise of defining pre-prints as a draft of an academic article or other publication before it has been submitted for peer-review or other quality assurance procedure as part of the publication process.” bioRxiv’s practice resembles (but doesn’t exactly match) the first of these definitions: it hosts scientific manuscripts prior to journal acceptance. And, adding to the mix, at the NFAIS Foresight Event on preprints, Kent Anderson used the term to refer to a manuscript he’d circulated privately to colleagues but not posted publicly. Chiarelli et al. have proposed six values that factor into varied definitions of the term, summarized in Figure 1.

Confusion over the term may be cause for alarm from the perspective of policymakers or meta-researchers. Without a single definition of “preprint,” uncomfortable questions abound. How can we measure the growth of preprints when we can’t even be sure which manuscripts are preprints and which are not? How can funders recognize preprints as research products when some of them are not archived or distributed openly? And what papers can a reader trust?

It’s no wonder that calls for a definition of “preprint” recurred at the NISO-NFAIS preprint event. After all, having a single definition would make it easier to assume that all articles bearing the label have undergone the same degree of scrutiny.

But viewing all preprints with the same eye parallels the fallacy of treating everything that has supposedly been peer reviewed as gospel truth. It’s convenient, but also dangerous and wasteful: dangerous because heterogeneity in screening practices is inevitable, and wasteful because it ignores in-depth screening and review when it does occur.

Rather than trying to compress all the information about the “state” of a manuscript into the assumed characteristics of a simple label, we need more transparency and clarity about the screening, review, and validation operations that have been performed.

Two levels of transparency are needed. The first is at the level of the server (and, for that matter, the journal), where better descriptions of review and screening policies would help guide authors in choosing where to submit. The second is at the level of the individual article, which is most beneficial to readers. As Neylon et al. note, variation can occur within a single publication venue; and as Carpenter points out, there is often no information on a manuscript’s state beyond the name of the container in which it appears.
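To make the idea concrete, here is a minimal sketch of the kind of article-level record a server could expose. Every field name and value below is invented for illustration; no existing server, schema, or standard is implied.

```python
# Hypothetical sketch: article-level metadata a preprint server might expose.
# All field names and values are invented for illustration only.

article_status = {
    "doi": "10.0000/example.0001",       # placeholder identifier
    "version": 2,
    "screening": {
        "plagiarism_check": True,        # automated similarity screening ran
        "scope_check": "passed",         # judged in scope for the server
        "ethics_statement_present": True,
    },
    "peer_review": {
        "journal_organized": False,      # no journal review completed yet
        "community_comments": 3,         # public comments posted on the server
        "invited_reviews": [],           # e.g., reviews from an overlay service
    },
}

# With metadata like this, a reader or aggregator can ask precise questions
# instead of relying on the label "preprint" alone:
def has_any_review(record: dict) -> bool:
    pr = record["peer_review"]
    return pr["journal_organized"] or bool(pr["invited_reviews"])

print(has_any_review(article_status))  # False: screened, but not yet reviewed
```

The point is not this particular structure, but that the screening and review operations actually performed on a manuscript become machine-readable facts rather than assumptions baked into a label.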

Below, I provide some examples of the potential variation among manuscripts that are all called preprints, along with the kinds of metadata that could be exposed.