A core principle of good public health practice is to base all policy decisions on the highest-quality scientific data, openly and objectively derived.1 Determining whether data meet these conditions is difficult; uncertainty can lead to inaction by clinicians and public health decision makers. Although randomized, controlled trials (RCTs) have long been presumed to be the ideal source for data on the effects of treatment, other methods of obtaining evidence for decisive action are receiving increased interest, prompting new approaches to leverage the strengths and overcome the limitations of different data sources.2-8 In this article, I describe the use of RCTs and alternative (and sometimes superior) data sources from the vantage point of public health, illustrate key limitations of RCTs, and suggest ways to improve the use of multiple data sources for health decision making.

In large, well-designed trials, randomization evenly distributes known and unknown factors between intervention and control groups, reducing the potential for confounding. Despite their strengths, RCTs have substantial limitations. Although they can have strong internal validity, RCTs sometimes lack external validity; generalization of findings beyond the study population may be invalid.2,4,6 RCTs usually do not have study periods or population sizes sufficient to assess the duration of a treatment effect (e.g., waning immunity of vaccines) or to identify rare but serious adverse effects of treatment, which often become evident during postmarketing surveillance and long-term follow-up but cannot practically be assessed in an RCT. The increasingly high costs and time constraints of RCTs can also lead to reliance on surrogate markers that may not correlate well with the outcome of interest. Selecting high-risk groups increases the likelihood of accruing adequate numbers of end points, but such groups may not be representative of the broader target population. These limitations, together with the fact that RCTs often take years to plan, implement, and analyze, reduce the ability of RCTs to keep pace with clinical innovation; new products and standards of care are often developed before earlier ones have completed evaluation. These limitations also constrain the use of RCTs for urgent health issues, such as infectious disease outbreaks, for which public health decisions must be made quickly on the basis of limited and often imperfect available data. RCTs are also limited in their ability to assess the individualized effect of treatment, which can vary with, for example, differences in surgical technique, and are generally impractical for rare diseases.

Many other data sources can provide valid evidence for clinical and public health action. Observational studies, including assessments of results from the implementation of new programs and policies, remain the foremost source; other examples include analysis of aggregate clinical or epidemiologic data. In the late 1980s, the high rate of the sudden infant death syndrome (SIDS) in New Zealand led to a case–control study comparing information on 128 infants who died from SIDS with that on 503 control infants.9 The results identified several risk factors for SIDS, including the prone sleeping position, and led to the implementation of a program to educate parents to avoid putting their infants to sleep on their stomachs — well before back-sleeping was definitively known to reduce the incidence of SIDS. The substantial reduction in the incidence of SIDS that followed this program became strong evidence of efficacy; implementation of an RCT for SIDS would have presented ethical and logistic difficulties. Similarly, the evidence base for tobacco-control interventions has depended heavily on analysis of the results of policies, such as taxes, smoke-free laws, and advertising campaigns, that have generated robust evidence of effectiveness — that is, practice-based evidence.

Current evidence-grading systems are biased toward RCTs, which may lead to inadequate consideration of non-RCT data.10 Objections to observational studies include the potential for bias from unrecognized confounding factors, along with the belief that such studies overestimate treatment effects.11 Although overestimation bias has been shown in some observational studies (e.g., overestimation of the effect of influenza vaccination on reducing mortality among older persons, owing to the bias introduced when healthier people are more likely to be vaccinated12), comparisons of validity between observational studies and RCTs have dispelled many misperceptions.4,6,13,14 A widely cited example involves the cardiovascular health risks associated with the use of menopausal hormone therapy. Data from an observational study suggested that menopausal hormone therapy would reduce the risk of heart disease15; results from a subsequent RCT showed increased cardiovascular risks.16 Although these differences were initially thought to indicate weaknesses in the observational study, further analyses determined that both studies had valid results for their patient populations and that the discrepancies were probably due to the timing of initiation of hormone therapy in relation to the onset of menopause.17-21 If so, then the RCT and the observational study in fact had concordant findings. However, a broad recommendation to use hormone therapy was made prematurely. Determining when data are sufficient for action is difficult, but the bar should be much higher when recommending that millions of persons without disease take medications. This line of reasoning does not suggest that the Food and Drug Administration should be less stringent in its review of drug safety and efficacy, but rather that there should be rigorous review of all potentially valid data sources.

No study design is flawless, and conflicting findings can emerge from all types of studies. The following examples show the importance of recognizing the strengths and limitations of all data sources and of finding ways to obtain the most useful data for health decision making.