The following rant focuses on sepsis research, but these principles are universal. I apologize for the agitated nature of this post, but I just can’t hold it in any longer. If I read one more correlational study which tries to imply causation, I might just explode. In order to prevent burnout, I’m going to journal these thoughts instead. So, hang on to your hats, because things are going to get a bit heated.

Repeated observational studies in sepsis are fake news

Time-to-intervention studies are the most obvious offenders.

On the display floor, an exhibitor promised to reduce "door to speculum time." Just take me now, lord. — Joe Lex (@JoeLex5) October 15, 2013

Three years ago, I wrote a post about the fallacy of time-to-intervention studies. The gist was as follows. Many studies attempt to retrospectively correlate outcomes with the time delay to therapy (e.g. door-to-needle time, door-to-antibiotic time, and my personal favorite, door-to-furosemide time). Numerous confounders tend to produce positive correlations, for example:

Prompt treatment may correlate with better care overall (e.g. treatment at higher-quality centers, during daytime hours, in the absence of overcrowding).

Rapid treatment may be a surrogate marker for patients who were previously healthy and present in a straightforward fashion, facilitating easy triage and diagnosis.
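To make the confounding argument concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the `complexity` variable, the delay model, and the mortality numbers are assumptions, not data from any study. Mortality is driven only by how complicated the presentation is, and the antibiotic delay has zero causal effect by construction. Even so, a naive early-versus-late comparison makes prompt treatment look life-saving.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Simulate patients whose outcome depends only on a hidden confounder."""
    rng = random.Random(seed)
    delays, deaths = [], []
    for _ in range(n):
        # Hypothetical confounder: how sick/complicated the presentation is (0..1).
        complexity = rng.random()
        # Complicated presentations are harder to triage, so antibiotics come later (hours).
        delay = 1.0 + 4.0 * complexity + rng.gauss(0, 0.5)
        # Mortality depends ONLY on complexity; delay does not appear here at all.
        p_death = 0.05 + 0.40 * complexity
        deaths.append(1 if rng.random() < p_death else 0)
        delays.append(delay)
    return delays, deaths


def mortality_by_delay(delays, deaths, cutoff=3.0):
    """Crude mortality for patients treated within vs. after the cutoff (hours)."""
    early = [d for t, d in zip(delays, deaths) if t <= cutoff]
    late = [d for t, d in zip(delays, deaths) if t > cutoff]
    return statistics.mean(early), statistics.mean(late)


delays, deaths = simulate()
early_mortality, late_mortality = mortality_by_delay(delays, deaths)
print(f"mortality, antibiotics within 3 h: {early_mortality:.3f}")
print(f"mortality, antibiotics after 3 h:  {late_mortality:.3f}")
```

In this toy model the late group dies far more often, yet speeding up antibiotics would change nothing. A retrospective analysis can only escape this if it measures and adjusts for the confounder perfectly, which real datasets rarely allow.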

We’ve been hoodwinked by time-to-intervention studies before. One example is the notorious four-hour antibiotic rule in pneumonia. A few time-to-intervention studies suggested that early antibiotics for pneumonia improved outcomes. Guidelines and government regulators then mandated that antibiotics be given within four hours. Eventually it became clear that this mandate promoted misdiagnosis and Clostridium difficile infection, and the rule was rescinded.

Time-to-intervention studies are seductive, because they make our pet interventions appear highly effective. We want to believe these studies. They are exciting. If we could just give everybody in the waiting room of the emergency department two liters of fluid and a dose of cefepime, they would all do great! Yahoo!

When I wrote that blog in 2014, I thought that it was a fairly obvious point (verging on being boring). I was wrong. The literature continues to brim with time-to-intervention studies. A new time-to-intervention study in sepsis surfaces every 3-6 months, often in major journals and to substantial fanfare. The most prominent recent example was Seymour 2017 [1].

Time-to-intervention studies are methodologic junk. I’d like to perform a study correlating survival with door-to-commode time in septic shock. The study would evaluate thousands of patients, looking at the delay between hospital admission and the time when the patient was first able to sit on a commode to move their bowels. Shorter door-to-commode time would correlate with survival (because sicker patients take longer to recover enough to sit up). The obvious conclusion, of course, is that if we want septic patients to survive we need to give them a bunch of laxatives and force them onto a commode to have a bowel movement. Problem solved!

Other correlative studies are bad too

Time-to-intervention studies might be the most obvious offenders, but the literature is replete with other observational studies which gingerly attempt to imply causation. For example, let’s explore Levy 2018, a fresh study released online a few days ago [2]:

In 2013, New York State mandated that hospitals follow a sepsis protocol and report outcomes. This study correlates compliance with the sepsis bundle with reduced mortality. The implication is that to reduce sepsis mortality we must enforce sepsis bundles using mandated reporting.

The number of confounding factors here is legion. For example, compliance with the sepsis protocol may simply be a surrogate marker of better care. Based on the retrospective, correlational construction of this study, it’s impossible to distinguish between causation and confounding.

If we look a bit deeper into the study, it becomes clear that the results are weird and nonsensical. Let’s take a look at the bundle elements that they studied:

Three-hour bundle:
3A) Administration of antibiotics
3B) Drawing blood cultures before administration of antibiotics
3C) Measuring blood lactate levels

Six-hour bundle:
6A) Administration of 30 cc/kg fluid for patients with hypotension or elevated lactate
6B) Vasopressors for refractory hypotension
6C) Repeat lactate measurement



Early antibiotics, fluid, and vasopressors are probably the most useful elements here [3]. Blood cultures can be helpful, but their yield is low among all comers with sepsis – so it’s unlikely that blood cultures would improve mortality much. Lactate is useful to identify sick patients, but once patients are already enrolled into a sepsis protocol the value of serial lactate levels is dubious.

The study results are shown above. According to this data:

The most important elements are blood cultures and lactate measurement. Conversely, vasopressor administration for refractory hypotension doesn’t matter at all.

Checking a blood culture within three hours is just as beneficial as completing the entire three-hour bundle (red boxes). So, if you check a set of blood cultures you don’t need to bother giving antibiotics.

This doesn’t make sense. We seem to be unraveling a complex nest of associations, rather than isolated causal relationships.

To add insult to injury, the study seems to imply that the reason the physicians were performing these interventions was because of mandated public reporting. However, there is no control group to test this hypothesis (e.g. comparison to another state which didn’t mandate reporting). Would the doctors have provided the same care without mandated reporting? Or perhaps better care? It’s impossible to tell, so any conclusion here is pure speculation.

Why keep generating hypotheses, when the hypotheses already exist?

The observational studies described above are correlational studies, which can never prove causation. As such, they are solely useful for hypothesis generation. However, hypotheses about these topics have already been generated – in fact they’ve been around for years. So, what is the purpose of a hypothesis-generating study, when hypotheses already exist?

Scientifically, these studies serve no discernible purpose. These studies instead appear to be thinly veiled attempts to prove established hypotheses using correlational data. This isn’t scientifically valid, but that doesn’t matter. If you repeat something enough times in major journals, people will start believing it’s true. Thus, these studies function largely as fake news: an insubstantial story which is dressed up to look and feel like scientific proof.

Multicenter RCTs are real news

The most rigorous way to answer questions about sepsis treatment is with multicenter RCTs. Unfortunately, one of the most promising such studies has recently come under threat.

The CLOVERS trial is a prospective multi-center RCT comparing liberal crystalloids versus early vasopressors in septic shock. It is sponsored by the NHLBI and the PETAL investigators, highly esteemed organizations. Its authors are a who’s who of prominent researchers, with enrolling sites that include dozens of leading institutions (e.g. MGH, Brigham, Stanford, Duke, UPMC). The rationale for the trial is explained in a thoughtful article in the Annals of Emergency Medicine. In short, this is an extremely promising study.

Unfortunately, the study has recently met resistance from Public Citizen, a think tank in Washington DC founded by Ralph Nader [4]. The group has sought media exposure to accuse the CLOVERS trial of turning patients into “unwitting guinea pigs in a physiology experiment that will not advance medical care for sepsis.” The group has various quibbles with study design, which don’t seem valid. For example, they fault CLOVERS for including only two arms, without including a third usual care arm. However, choosing two well-defined study arms is probably a wise decision, in order to ensure adequate separation of the arms and to maximize statistical power. A usual care arm would be murky, since it would consist of a hodgepodge of both strategies (depending on the whims of the treating clinician).
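The point about arm separation and power can be sketched with a back-of-the-envelope sample size calculation. This uses the standard normal-approximation formula for comparing two proportions. The mortality rates below are purely hypothetical (they are not the CLOVERS design assumptions), and the usual care arm is assumed to land halfway between the two strategies because it mixes both.

```python
import math
from statistics import NormalDist


def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical mortality rates, invented for illustration only.
p_fluids = 0.30
p_pressors = 0.25
# Assumption: usual care is a hodgepodge of both strategies, so it sits in between.
p_usual = (p_fluids + p_pressors) / 2

n_two_arm = n_per_arm(p_fluids, p_pressors)
n_vs_usual = n_per_arm(p_fluids, p_usual)
print(f"per-arm n, fluids vs pressors:   {n_two_arm}")
print(f"per-arm n, fluids vs usual care: {n_vs_usual}")
```

Halving the difference between arms roughly quadruples the number of patients needed per comparison, because the required sample size scales with the inverse square of the effect size. Under these assumptions, a murky middle arm would cost a great deal of enrollment for very little information.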

Methodologic foibles aside, halting a major RCT would set a dangerous precedent. There are already innumerable barriers in place to performing high-quality clinical research. Adding additional barriers (e.g., anti-research think tanks) could easily make it insurmountably difficult to accomplish clinical research in the United States.

History has taught us that RCTs frequently reveal surprising results, with major public health benefits. For example, the ORBITA trial recently compared stenting versus a sham procedure among patients with refractory chronic angina. If that study had been performed in the United States, groups like Public Citizen would undoubtedly have agitated that it was an unethical trial (performing a sham procedure on someone with refractory angina?!?). Of course, the trial revealed no benefit from stenting – proving that the study was in fact both ethical and enormously valuable.

As evidenced above, I have lots of opinions and I’m not shy about sharing them. However, I’ve never criticized a study which is in progress. There are already rigorous systems in place to ensure that trials are performed safely and ethically (e.g., IRB review, data safety monitoring boards). Interference with ongoing clinical trials is a slippery slope, because there’s always something to criticize about any study. It’s best for us to allow trialists to complete their studies unfettered by external criticism.

Retrospective studies, which correlate the timeliness of an intervention with outcomes, have numerous confounding factors and cannot ever prove causation. This study design is extremely weak and probably doesn’t merit publication in major journals.

Other forms of observational, correlative studies cannot prove causation either.

There is little scientific value in performing repeated observational studies in attempts to prove an established hypothesis. The observational studies can neither prove nor refute the hypothesis.

Multicenter RCTs are needed to prove causality. We should encourage such studies and allow them to be performed without external interference.


1. Seymour C, Gesten F, Prescott H, et al. Time to Treatment and Mortality during Mandated Emergency Care for Sepsis. N Engl J Med. 2017;376(23):2235-2244.

2. Levy M, Gesten F, Phillips G, et al. Mortality Changes Associated with Mandated Public Reporting for Sepsis: The Results of the New York State Initiative. Am J Respir Crit Care Med. September 2018.

3. I don’t advocate giving every single patient 30 cc/kg fluid as a blind mandate, but on average when applied to a population of patients this is probably a beneficial intervention.

4. This is proof that whenever you think that things in Washington DC can’t get any weirder, they do.