In this article, we construct estimates of the POS and other related risk characteristics of clinical trials using 406 038 entries of industry- and non-industry-sponsored trials, corresponding to 185 994 unique trials over 21 143 compounds from Informa Pharma Intelligence’s Trialtrove and Pharmaprojects databases from January 1, 2000 to October 31, 2015. This is the largest investigation thus far into clinical trial success rates and related parameters. To process this large amount of data, we develop an automated algorithm that traces the path of drug development, infers the phase transitions, and computes the POS statistics in hours.

They have some 400,000 data points to work with, roughly one-third of which are associated with industrial drug development. About 15% of the large set had no termination date associated with the trials, so median lengths were imputed, and trials were marked as failed if no further action was observed after defined intervals. They count a trial, very reasonably, as the investigation of a particular drug for a single indication. If a trial is terminated early for any reason except early positive data, it’s marked as failed, and if a drug makes it through one phase and does not move on to the next, it’s listed as “terminated in Phase X”. A difference between this paper and others is that they’re trying to get “path by path” numbers, teasing out individual drug projects and counting them up, as opposed to finding (say) the total number of Phase II trials that started in a given period (a “phase by phase” approach, as the paper has it). As they point out, this is really only possible in more recent years, now that registration of trials has become mandatory (the data set itself, though, covers 2000-2015, and ClinicalTrials.gov registration became mandatory in 2007).
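To make the “path by path” versus “phase by phase” distinction concrete, here is a minimal Python sketch. It is not the authors' algorithm: the toy records, the drug names, and the helper functions are all hypothetical, and it ignores the paper's handling of in-progress trials, imputed trial lengths, and time-outs. It only shows how tracking each drug-indication pathway can give a different transition rate than counting trials per phase when a record is missing.

```python
from collections import defaultdict

# Hypothetical toy records (not from the paper): (drug, indication, phase).
TRIALS = [
    ("drug_a", "asthma", 1),
    ("drug_a", "asthma", 2),
    ("drug_a", "asthma", 3),
    ("drug_a", "copd", 2),      # the Phase 1 record for this pathway is missing
    ("drug_b", "melanoma", 1),
    ("drug_b", "melanoma", 2),
    ("drug_c", "melanoma", 1),
]


def highest_phase_by_path(trials):
    """Map each (drug, indication) pathway to the highest phase observed."""
    paths = defaultdict(int)
    for drug, indication, phase in trials:
        key = (drug, indication)
        paths[key] = max(paths[key], phase)
    return dict(paths)


def path_by_path_rate(trials, phase):
    """Share of pathways that reached `phase` and also reached `phase + 1`.

    A pathway seen in a later phase is assumed to have passed the earlier
    ones, even if the earlier trial record is missing.
    """
    highest = list(highest_phase_by_path(trials).values())
    reached = sum(1 for h in highest if h >= phase)
    advanced = sum(1 for h in highest if h >= phase + 1)
    return advanced / reached if reached else float("nan")


def phase_by_phase_rate(trials, phase):
    """Trials observed in `phase + 1` divided by trials observed in `phase`."""
    in_phase = sum(1 for _, _, p in trials if p == phase)
    in_next = sum(1 for _, _, p in trials if p == phase + 1)
    return in_next / in_phase if in_phase else float("nan")


if __name__ == "__main__":
    print("path by path, Phase 1 -> 2:  ", path_by_path_rate(TRIALS, 1))
    print("phase by phase, Phase 1 -> 2:", phase_by_phase_rate(TRIALS, 1))
```

On this toy set the missing Phase 1 record makes the raw trial counts look like every Phase 1 trial advanced, while the pathway view still sees that one of the four pathways stopped after Phase 1.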

They come out with higher success rates than the other studies in this area. The standard estimate for the overall probability of clinical success is about 10%, but this study has 13.8% of all pathways actually making it through. The biggest difference is in the Phase II-Phase III transition, and this is thought to be due to better coverage of missing trials.

A closer look at the data, though, tells a rather different story. That overall POS figure is heavily dragged down by low success rates in oncology. Of the 41,040 total pathways in the set, 17,368 are for oncology (note that the same drug tried against two different types of cancer will show up as two different pathways). The POS of everything outside of oncology is 20.9%, while the POS in oncology itself is 3.4%. If you look at lead indications, instead of all indications, the POS goes up overall (which is in line with earlier studies). But the Phase II to Phase III transition rate actually goes down a bit, interestingly. Oncology is still the lowest of the bunch.
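As a rough sanity check on how much oncology weighs on the headline number, the figures quoted above can be blended by pathway count. This is only back-of-the-envelope arithmetic, and the pathway-weighted average is my assumption, not necessarily how the paper computes its overall POS, so it need not reproduce the 13.8% figure exactly.

```python
# Figures quoted in the text above.
TOTAL_PATHS = 41040
ONCOLOGY_PATHS = 17368
POS_ONCOLOGY = 3.4        # percent
POS_NON_ONCOLOGY = 20.9   # percent


def blended_pos(n_a, pos_a, n_b, pos_b):
    """Pathway-weighted average of two subgroup success rates, in percent."""
    return (n_a * pos_a + n_b * pos_b) / (n_a + n_b)


non_oncology_paths = TOTAL_PATHS - ONCOLOGY_PATHS
oncology_share = ONCOLOGY_PATHS / TOTAL_PATHS
blend = blended_pos(
    ONCOLOGY_PATHS, POS_ONCOLOGY, non_oncology_paths, POS_NON_ONCOLOGY
)

print(f"oncology share of pathways: {oncology_share:.1%}")
print(f"pathway-weighted POS: {blend:.1f}%")
```

Oncology comes to about 42% of all pathways, and the simple blend lands at roughly 13.5%, in the same neighborhood as the reported overall figure. That is enough to show that the oncology pathways alone pull the overall rate down from about 21% to the low teens.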

The authors tried to see if biomarkers are helping out (since they’re supposed to). Only 7% of the trials used a biomarker at all stages of development – some used them only for patient selection at the start, for example. Almost all the biomarker-using trials (of any kind) are post-2005. Of the trials that use them to stratify patients at the start (which are almost all oncology trials), the POS nearly doubles, which is good to see. But the broader picture is messier: