Patents from papers both basic and applied

Public funding for research depends on the idea that the resulting knowledge translates into socially valuable outcomes, such as medicines. Such linkages are easier to assert than to prove. Li et al. studied 27 years of grant-level funding by the U.S. National Institutes of Health. About 10% of grants are directly cited by patents, suggesting some technological application, and 30% of grants are cited in research articles that are then cited in patents. Five percent of grants result in papers cited by patents for successfully approved drugs, compared with less than 1% that are cited directly by such patents. These patterns hold regardless of whether the research is more basic or applied. Science, this issue p. 78

Abstract

Scientists and policy-makers have long argued that public investments in science have practical applications. Using data on patents linked to U.S. National Institutes of Health (NIH) grants over a 27-year period, we provide a large-scale accounting of linkages between public research investments and subsequent patenting. We find that about 10% of NIH grants generate a patent directly but 30% generate articles that are subsequently cited by patents. Although policy-makers often focus on direct patenting by academic scientists, the bulk of the effect of NIH research on patenting appears to be indirect. We also find no systematic relationship between the “basic” versus “applied” research focus of a grant and its propensity to be cited by a patent.

The claim that investments in publicly funded science ultimately have practical application is perhaps the central assumption in postwar science policy (1). Although private-sector research and development (R&D) investments can be more easily linked to a firm’s own marketed products, knowledge generated by public investments in science is often meant to be freely accessible to multiple other parties, making it difficult to keep track of whether and by whom this knowledge is used. Moreover, publicly funded research may have applications far from its original area, many years or even decades later, making the links between funding and commercial use difficult to predict (2). When public investments in science lay a foundation for innovation by others—with heterogeneous time lags and spillovers across topics—how can we credit these investments for contributing to the development of these innovations?

We analyze the output of research grants awarded by the U.S. National Institutes of Health (NIH), the world’s largest single funder of research in the life sciences, with an annual budget of over US$30 billion (appendix A). NIH provides support for one-third of biomedical R&D in the United States overall, as well as the majority of funding for so-called “basic” biomedical research (3). Using data on life-science patents (including drugs, devices, and other medical technologies) linked to NIH grants over a 27-year period, we provide a method for large-scale accounting of linkages between public research investments and commercial applications. Recognizing that some patents are more valuable than others, we also examine linkages between NIH grants and patents associated with marketed drugs (appendix B). Although many patents are associated with development efforts that ultimately failed, patents on drugs approved by the U.S. Food and Drug Administration (FDA) indicate inventions that firms found valuable enough to marshal through the costly testing and launch process and that the FDA views as safe and effective.

There are two basic ways through which NIH-funded research may affect patenting and drug development. First, NIH-funded scientists may themselves produce patents. The 1980 Bayh-Dole Act created incentives for these researchers and their institutions, typically universities, academic medical centers, and nonprofit research institutes, to patent their discoveries so that they could be licensed to private firms. The act required institutions to report patents resulting from public funding to the government. This reporting requirement enables us to identify patents that are directly produced as a result of NIH funding (appendix C). Public funding for biomedical research, however, is typically intended to have an effect beyond the direct production of patents. To capture this broader effect, our second measure identifies private-sector patents that cite NIH-funded research. We collect all scientific publications that are listed in the “References Cited” section of private-sector patents, determine which articles result from NIH funding, and identify the grant numbers for those that do (appendices C and D). Scientific references generated during the patent application process are part of the “prior art” against which patent examiners judge the patentability of inventions. References to prior articles are thus similar to references to prior patents, which have been widely used to examine the effect of science. Patent-article references, however, have two major advantages: (i) publications rather than patents are the primary output of academic research, and (ii) unlike citations to other patents, citations to published articles are much more likely to come from patent applicants themselves rather than from patent examiners (4). 
Although citations to articles contained in patent documents are not perfect measures of knowledge flows, validation exercises against survey data suggest that patent-article citations provide better signals of the intellectual influence of public science than previously used measures (5). We are able to identify patents that build on NIH-funded research without making a priori assumptions about the diffusion of scientific knowledge over time and across disease areas (e.g., whether grant funding by the National Cancer Institute leads to research cited by patents on AIDS treatments). Appendix E provides details on the process followed to pair life science patents with the individual PubMed records they reference.
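The indirect channel described above amounts to chaining two lookups: from a patent to the articles it cites, and from each article to the grants it acknowledges. The following is a minimal sketch of that chaining, not the procedure of appendices C to E. All identifiers (`P1`, `A1`, `G10`, and so on) and the function name `indirect_links` are hypothetical, and the sketch ignores the timing window applied to grant acknowledgments.

```python
from collections import defaultdict

# Hypothetical toy records; every identifier here is invented for illustration.
patent_cites = {           # patent -> article IDs in its "References Cited"
    "P1": ["A1", "A2"],
    "P2": ["A2"],
    "P3": ["A9"],          # A9 acknowledges no NIH grant
}
article_grants = {         # article -> grant numbers it acknowledges
    "A1": ["G10"],
    "A2": ["G10", "G20"],
}

def indirect_links(patent_cites, article_grants):
    """Map each grant to the set of patents citing at least one of its articles."""
    grant_to_patents = defaultdict(set)
    for patent, articles in patent_cites.items():
        for article in articles:
            # Articles without a known grant contribute no linkage.
            for grant in article_grants.get(article, []):
                grant_to_patents[grant].add(patent)
    return dict(grant_to_patents)

links = indirect_links(patent_cites, article_grants)
```

In this toy example, patent `P3` links to no grant because the only article it cites carries no grant acknowledgment.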

Our sample consists of 365,380 grants funded between 1980 and 2007, almost all NIH grants over this period. Nearly half of these (164,378) are R01-equivalent grants, large project-based renewable grants that form the foundation of NIH’s extramural spending. A total of 30,829 (8.4%) of these grants are directly acknowledged by patents, leading to 17,093 “Bayh-Dole” patents assigned primarily to universities and hospitals. A much larger set of grants, 112,408 (31%), produces research that is cited by 81,462 private-sector patents in aggregate (note that these two channels are not mutually exclusive). These indirectly linked patents demonstrate the additional reach that publicly funded science can have by building a foundation for private-sector R&D.

Figure 1A describes the lag times between NIH funding and follow-on patenting both via direct acknowledgements and indirect citation linkages. At a given point t on the x axis, we plot the proportion of t–year-old grants that have been linked to a patent. This curve is generally increasing because a grant’s likelihood of being linked to a patent increases with age. In some cases, these curves turn downward in later years because of cohort effects; e.g., the proportion of grants linked to patents after 25 years does not include grants less than 25 years old (because these figures conflate time and cohort effects, we report a survival analysis in appendix G that separately controls for grant cohort). The difference in the number of patents we are able to link to public science funding via these two different approaches is immediately apparent.

Fig. 1 Grant-patent lags, direct versus indirect patenting. (A and B) Based on a sample of 365,380 NIH grants awarded between the years 1980 and 2007. A grant is directly linked to a patent if the patent contains a government interest statement explicitly referencing the grant. A grant is indirectly linked to a patent if a publication acknowledges the grant within 5 years of the start of a particular funding period for the grant (covering the fiscal year in which it is first disbursed up until the year the funding runs out, typically 3 to 5 years), and a patent cites this publication as prior art. For each year after approval, the percentage of linked patents is calculated using only grants that have reached that age.
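The age-conditional share plotted in Fig. 1 can be illustrated with a short sketch. It assumes each grant is summarized by its award year and the year of its first linked patent (or `None`), and that linkages are observed through a hypothetical `last_observed_year`. The function name and the toy data are illustrative, not taken from the study.

```python
def linked_share_by_age(grants, last_observed_year, max_age):
    """Share of grants linked to a patent by age t, among grants that reached age t.

    grants: list of (award_year, first_link_year or None) tuples.
    """
    shares = {}
    for t in range(max_age + 1):
        # Only grants old enough to have reached age t enter the denominator.
        at_risk = [g for g in grants if last_observed_year - g[0] >= t]
        if not at_risk:
            continue
        linked = sum(
            1 for award, first_link in at_risk
            if first_link is not None and first_link - award <= t
        )
        shares[t] = linked / len(at_risk)
    return shares

# Toy data (invented): two 1980 grants, one 2000 grant, one 2005 grant.
toy = [(1980, 1985), (1980, None), (2000, 2002), (2005, None)]
shares = linked_share_by_age(toy, last_observed_year=2010, max_age=35)
```

With these toy inputs the share rises to 2/3 at age 6 and falls back to 1/2 at age 11, once the 2000 grant drops out of the denominator. This is the kind of cohort-driven downturn noted in the text.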

Our results so far indicate that, although Bayh-Dole and other policies emphasize patenting by academic researchers themselves, the effect of NIH research through traditional channels—private patents citing publications from NIH grants—is almost four times greater. Moving forward, we adopt this as our preferred measure of patenting associated with NIH funding.

We look separately at patents associated with drug approvals, using data from the FDA. In general, there are far fewer such patents (only 4414 of the life science patents in our sample are associated with FDA-approved drugs), meaning that a smaller proportion of NIH-funded grants will be linked to such patents. Less than 1% of NIH grants are directly acknowledged by a patent associated with a marketed drug (Fig. 1B), but 5% of grants result in a publication that is cited by a patent associated with a marketed drug. Here again, the indirect effect dominates the effect via the direct Bayh-Dole channel.

The question of whether more “basic” or “applied” grants are ultimately more valuable for progress is an old one in science policy (1, 6). One complication is that there is no consensus on the definitions and distinctions between the two (7, 8). “Basic” research has been variously defined by whether it seeks general or specific knowledge (9), by the institutional environment where it takes place and the norms regarding dissemination (10), by whether it is undertaken for its own sake or with some application in mind (7), and by whether or not it is targeted to a specific program or mission (6), among other ways.

Rather than try to resolve this debate, we examine four different dimensions that have been of interest to medical research policy-makers: whether the research is disease-oriented, whether it is focused on patients (6, 11), and whether it is solicited by the funder or is investigator-initiated (12); for the subset of grants that are not disease-oriented, we also examine the complexity of the model organism studied (13). Except for the solicited versus investigator-initiated distinction, all the “basicness” measures rely on a semantic mapping, using a natural language processing tool (the Medical Text Indexer), between funded grant abstracts and Medical Subject Heading (MeSH) keywords, the controlled vocabulary maintained by the National Library of Medicine (appendix F).

A grant is said to be disease-oriented if its abstract can be mapped to at least one MeSH term corresponding to a disease (i.e., the MeSH code starts with the letter C). By this measure, 183,517 grants (50% of our sample) are disease-oriented.
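The disease-orientation rule reduces to a prefix check on the MeSH codes mapped to a grant abstract. A minimal sketch follows, assuming the codes are already available as strings; the function name and the example codes are illustrative only, and the mapping from abstract to codes (done by the Medical Text Indexer in the study) is not reproduced here.

```python
def is_disease_oriented(mesh_codes):
    """True if at least one mapped MeSH code starts with the letter C."""
    return any(code.startswith("C") for code in mesh_codes)

# Illustrative code lists (assumed, not drawn from the study's data).
with_disease_term = is_disease_oriented(["D12.776", "C04.588"])
without_disease_term = is_disease_oriented(["D12.776", "B01.050"])
no_terms = is_disease_oriented([])
```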

Distinguishing patient-oriented grants from other projects is straightforward, because the MeSH controlled vocabulary includes a term for humans. Patient-oriented grants defined in this way include (but are not limited to) research that uses human subjects. Using this measure, 177,692 grants (49% of our sample) are patient-oriented.

Whether the research was solicited, via a request for applications (RFA), is based on NIH administrative data. RFAs (24% of our sample) are typically used to direct research at particular diseases or problems and thus are more likely to represent applied work.

We use MeSH terms to classify NIH grants by the complexity of the model organism they propose to study. Although admittedly crude, this taxonomy captures the idea that scientists are more likely to bear the financial and logistical costs of working with higher-order animal models when conducting research intended to be more applicable to humans. In contrast, simple organisms are often chosen to elucidate fundamental biological phenomena without consideration of therapeutic usefulness (14).

For this classification, we restrict our sample to grants that are not disease-oriented, based on the first measure above, to eliminate clinical or translational research that happens to study the effect of viruses or bacteria. We focus on grants that mention at least one organism in the abstract and take into account the natural hierarchy of model organisms by grouping them into coherent nonoverlapping sets: viruses, prokaryotes, unicellular eukaryotes, multicellular eukaryotes, invertebrates, vertebrates, rodents, other mammals, primates, and finally humans. When an abstract can be mapped to two or more levels of this hierarchy, we assign the grant to the higher-order organism (appendix F).
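The assignment rule for abstracts that map to several levels can be sketched as taking the maximum over an ordered list. The ordering below copies the sequence given in the text; the names `ORGANISM_ORDER` and `assign_organism` are hypothetical, and the sketch assumes the organism levels for each abstract have already been extracted.

```python
# Ordering as listed in the text, from lowest to highest.
ORGANISM_ORDER = [
    "viruses", "prokaryotes", "unicellular eukaryotes",
    "multicellular eukaryotes", "invertebrates", "vertebrates",
    "rodents", "other mammals", "primates", "humans",
]
RANK = {name: i for i, name in enumerate(ORGANISM_ORDER)}

def assign_organism(levels):
    """Return the highest-order level among those mentioned, or None if none match."""
    known = [level for level in levels if level in RANK]
    if not known:
        return None  # grants mentioning no organism fall outside this analysis
    return max(known, key=RANK.__getitem__)

mixed = assign_organism(["prokaryotes", "rodents"])
single = assign_organism(["viruses"])
none_found = assign_organism([])
```

Here a grant whose abstract maps to both prokaryotes and rodents is assigned to rodents.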

Grants targeting diseases are more likely to produce research that is cited by a patent, but this difference is small: 35% of disease-oriented grants versus 30% of non–disease-oriented grants (Fig. 2A). When we examine grants linked to patents on FDA-approved drugs, we find that non–disease-targeted grants yield a similar number of high-value patents (Fig. 2B). The difference in these curves suggests that although non–disease-oriented research may take more time to yield drug-related patents, its value levels off more slowly over time.

Fig. 2 Grant-patent lags, by basic or applied orientation. (A and B) A grant is designated disease-targeted if its abstract can be mapped to at least one MeSH term corresponding to a disease through the Medical Text Indexer. (C and D) A grant is designated patient-oriented if its abstract can be mapped to the MeSH term for humans through the Medical Text Indexer. (E and F) A grant is designated as RFA if it is submitted as part of a request for applications. Bayh-Dole patents that cannot be linked to a grant through a publication are excluded from the analysis. See appendix F for further details on these classifications.

Non–patient-oriented research yields patents at virtually identical rates to patient-oriented research (Fig. 2C). Non–patient-oriented research appears to continue accruing patents associated with FDA drugs even after this levels off for patient-oriented research (Fig. 2D). Non–RFA-solicited research, more likely to be basic, produces patent output similar to RFA-solicited research (Fig. 2E), although this time we find slightly more FDA-approved drugs for the set of RFA-solicited grants (Fig. 2F).

Even non–disease-oriented research on simple organisms is almost as likely to produce research that is linked to patents as research on “higher-order” organisms (Fig. 3). Taken together, Figs. 2 and 3 suggest that, based on our measures, basic and applied grants are quite similar in their linkages to commercial patenting.

Fig. 3 Grant-patent lags, “animal kingdom” ordering. Grants are assigned to animal kingdom categories based on the highest model organism that their abstract can be mapped into, through the Medical Text Indexer. The grants considered in this analysis exclude disease-oriented grants. Bayh-Dole patents that cannot be linked to a grant through a publication are excluded from the analysis. See appendix F for further details on this classification.

Our research builds on and extends previous work in several ways. Although a considerable body of research has examined academic patenting linked to public research (15), and some authors have done so at the grant level (16), ours compares the relative magnitude of patenting through direct and indirect channels using individual grant data. Although Sampat and Lichtenberg (17) examined the relative importance of these two channels for marketed drugs, their analysis was retrospective, whereas ours is prospective. Other papers (18) that take a prospective approach only consider one of the two channels, and only for a subset of NIH grants. The paper also adds to a long line of previous bibliometric research (19) not only by linking patents to scientific articles but also by linking the articles back to funding sources and by attempting to categorize these grants by different measures of “basicness.”

Although our analysis is a large-scale evaluation of different types of linkages between NIH research and private patenting, there are important limitations. There may be underreporting of Bayh-Dole patents to the federal government by academic institutions, which would understate the importance of the direct linkages (20). Measuring indirect linkages through patents citing articles is also imperfect. Applicants may have incentives to overcite known prior art (21), and the extent to which they search for prior art may vary by invention importance (22). Citations are made to satisfy legal criteria and may not necessarily reflect strong intellectual influences. On the other hand, our approach may underestimate linkages between NIH funding and patenting because not all intellectual influences are embodied in articles (e.g., the effects of NIH training). While patent-paper references improve on previous measures of knowledge flows (see above and the supplementary materials), more work is needed to understand potential noise or biases in these measures. Because we look only at first-generation citations, we also miss grants that generate articles that are not cited by patents but are cited by other articles that in turn are cited by patents. This would lead us to underestimate links between NIH funding and patents. Finally, our measures of “basicness” only capture, imperfectly, some of the relevant dimensions in the age-old debates regarding basic versus applied research.

Despite these limitations, we provide several new stylized facts. About a third of NIH grants generate research that is cited by commercial patents. This is much greater than the share of grants directly yielding patents (less than 10%), even though policy-makers often focus on this easier-to-grasp metric to capture the near-term economic returns to public funding of biomedical R&D (23).

There is no obvious relationship between “basicness” and likelihood of being cited by a patent. One interpretation of this is that “basic” research is nearly as productive as “applied” research, which may be surprising to those who question its value (24). On the other hand, we find little evidence for claims that basic research is substantially more impactful over the period we study (1, 25). Our results are consistent with arguments that the basic/applied distinctions may not be so useful in thinking about what types of research funding are more productive.

Supplementary Materials
www.sciencemag.org/content/356/6333/78/suppl/DC1
Appendices A to G
References (27–57)

Acknowledgments: P.A. acknowledges the financial support of the National Science Foundation through its Science of Science and Innovation Policy (SciSIP) Program (award SBE-1460344).