In line with current medical risk assessment practices (e.g., in oncology 48 - 50 , surgery or cardiology 51 - 54 ), we used the ICPP IPD to develop a prediction function that estimates the probability of PTSD given a set of early, observable risk indicators. Following replicated demonstrations of their predictive yield in classification models 55 - 62 , we positioned PTSD symptoms as a key predictor, subsequently enriching the predictive models by including other previously documented and clinically‐obtainable risk indicators available in the ICPP dataset (e.g., gender, trauma type, lifetime trauma history).

An analysis of IPD, or mega‐analysis, offers a sensible approach to aggregating data across studies 44 , 45 . Unlike systematic reviews and meta‐analyses, mega‐analyses do not rely on the original studies’ data analytic approaches and reporting perspectives and enable direct estimates of parameters of interest (i.e., predictors, outcomes). This allows data source heterogeneity and subgroup variations to be examined directly, and makes it possible to interrogate the combined data in ways not considered, or impossible, in the component studies, due to their sample sizes or limited population diversity 46 , 47 .

Quantifying individuals’ PTSD risk following acute care trauma admission could provide an empirical foundation for mitigating and preventing a major public health issue. Towards that goal, members of the International Consortium to Predict PTSD (ICPP) shared item‐level data from ten longitudinal, acute care based studies of the early development of PTSD, performed in the US, Australia, Japan, Israel, Switzerland, and The Netherlands. The data were harmonized, pooled into a single individual participant‐level dataset (IPD) and submitted to data analysis.

The prevalence of PTSD after ED admissions resembles that seen in survivors who do not require or receive ED care – e.g., 52% incidence of new PTSD among women survivors of interpersonal violence admitted to EDs vs. 51‐76% among women surveyed in shelters, domestic‐violence clinics and therapy groups 41 , 42 . The 18‐month prevalence of PTSD among drivers admitted to general hospitals after injury‐producing car crashes (11%) is somewhat higher than that of car drivers not seen in EDs (7%) 43 .

Trauma admissions to acute care centers and emergency departments (EDs) offer a first point of contact with numerous survivors at risk. In the US, EDs evaluate over 39 million individuals yearly for treatment of traumatic injury 35 - 39 . Worldwide, road traffic accidents, a mainstay cause of ED admissions, cause an estimated 1.25 million deaths and over 20 million non‐fatal injuries yearly 40 .

Longitudinal studies have nonetheless reported numerous group‐level PTSD risk indicators 21 , 22 , such as female gender 23 , 24 , age 23 , education 25 , ethnicity 26 , lifetime exposure to traumatic events 27 , and marital status 24 . Several symptom‐based case predictions have been developed, consistently performing better than chance 28 - 31 , but unable to build a reliable, personalized risk estimator 32 . Meta‐analyses 21 , 22 and systematic reviews 21 , 22 , 33 , 34 have similarly endorsed group‐level risk indicators without a clear path to clinical implementation 34 .

Previous studies have had difficulty producing such estimates, due to the multiplicity, complexity and distributional variation of PTSD risk indicators. Additionally, most studies have attempted to predict cases (i.e., who will develop PTSD) rather than produce PTSD likelihood estimates for every participant (i.e., how likely is a person to develop PTSD) 19 , 20 .

Early cognitive behavioral interventions significantly reduce the prevalence of PTSD, and their effect is stable 8 , 15 , 16 . These interventions, however, are resource‐demanding, and unnecessary for low‐risk survivors, whose symptoms subside spontaneously 15 , 17 . Thus, an accurate individual estimate of survivors’ risk for chronic PTSD is a prerequisite for efficient prevention and service planning 18 .

Post‐traumatic stress disorder (PTSD) is the most frequent psychopathological consequence of traumatic events 1 , 2 . Chronic PTSD is tenacious, debilitating and frequently intractable 3 - 9 . Early PTSD symptoms are sensitive but non‐specific predictors of chronic PTSD 10 . They subside in over 70% of those expressing them 11 - 13 , whilst few initially asymptomatic survivors develop delayed‐onset PTSD 14 .

The selected time window for determining endpoint PTSD status (122‐456 days; 4‐15 months) maximized the number of ICPP studies included in each time interval. To evaluate whether the substantial width of that time window affected the results, and to additionally produce an estimate of prolonged PTSD likelihood, we repeated the logistic regressions using participants whose PTSD status was obtained 9 to 15 months (273‐456 days) after the traumatic events.

Differences in the predicted probability of PTSD given different risk factors were estimated by drawing 1,000 posterior simulations of each model's β coefficients, predicting endpoint PTSD at each value of CAPS 0 with different risk profiles (e.g., male versus female gender), and evaluating the differences in the predicted probabilities across baseline CAPS 0 scores 80 .
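The simulation procedure can be illustrated with a minimal sketch. It assumes, for simplicity, that each coefficient is drawn independently from a normal distribution centered on its estimate (a full implementation would draw from the joint distribution using the coefficient covariance matrix). The function names and the coefficient values below are hypothetical placeholders, not the fitted ICPP values.

```python
import math
import random


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_risk_difference(beta, se, caps0, profile_a, profile_b,
                             n_draws=1000, seed=0):
    """Simulate P(PTSD | profile_a) - P(PTSD | profile_b) at one CAPS 0 score.

    beta, se: dicts keyed by 'intercept', 'caps0' and covariate names.
    profile_a, profile_b: dicts mapping covariate names to 0/1 values.
    Assumption: coefficients are drawn independently (covariances ignored).
    Returns the mean difference and an approximate 95% simulation interval.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_draws):
        draw = {name: rng.gauss(beta[name], se[name]) for name in beta}
        base = draw["intercept"] + draw["caps0"] * caps0
        p_a = inv_logit(base + sum(draw[k] * v for k, v in profile_a.items()))
        p_b = inv_logit(base + sum(draw[k] * v for k, v in profile_b.items()))
        diffs.append(p_a - p_b)
    diffs.sort()
    lower = diffs[(n_draws * 25) // 1000]        # ~2.5th percentile
    upper = diffs[(n_draws * 975) // 1000 - 1]   # ~97.5th percentile
    return sum(diffs) / n_draws, lower, upper


# Hypothetical placeholder values, not the fitted ICPP coefficients.
beta = {"intercept": -4.5, "caps0": 0.05, "female": 0.3}
se = {"intercept": 0.2, "caps0": 0.004, "female": 0.15}
mean_diff, lower, upper = simulate_risk_difference(
    beta, se, caps0=40, profile_a={"female": 1}, profile_b={"female": 0})
```

Repeating the call over the range of CAPS 0 scores gives the difference in predicted probability, with its interval, at each baseline severity.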

To evaluate the two models, we compared the predictive fits of the fixed effects and the random effects logistic regressions with CAPS 0 as the only predictor, using a bootstrap approach in which participants were randomly sampled with replacement, models were fitted, and predicted probabilities from both models were then estimated among the left‐out participants. For each approach, we obtained the ratio of expected to actual PTSD diagnoses (expected/observed or E/O), the calibration slope β overall (the slope from a logistic regression of endpoint PTSD on the predicted probabilities), and the Brier score. An E/O far from 1 indicates that the model's intercept, which determines the predicted prevalence of PTSD, is too high or too low, while a calibration slope far from 1 reflects heterogeneity of the predictor‐outcome associations or over‐fitting of the data 44 . This process was repeated 100 times with statistics averaged across iterations. A finding of poorer results in the fixed effects model compared to the random effects model would indicate that the studies were too heterogeneous to be analyzed together after accounting for differences in the distribution of CAPS 0 .
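Two building blocks of this evaluation, the bootstrap split and the E/O ratio, can be sketched as follows. This is an illustration under our own assumptions (function names are hypothetical, and model fitting and the calibration slope are omitted), not the code used for the reported analysis.

```python
import random


def bootstrap_split(n, rng):
    """Sample n participant indices with replacement.

    Returns the in-bag indices (used to fit a model) and the left-out
    indices (never sampled, used to evaluate predictions).
    """
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    left_out = [i for i in range(n) if i not in chosen]
    return in_bag, left_out


def expected_observed_ratio(predicted, observed):
    """Ratio of expected PTSD cases (sum of predicted probabilities)
    to observed PTSD cases (sum of 0/1 outcomes)."""
    n_observed = sum(observed)
    if n_observed == 0:
        raise ValueError("E/O is undefined when no cases are observed")
    return sum(predicted) / n_observed


rng = random.Random(1)
in_bag, left_out = bootstrap_split(50, rng)
```

In a full run, a model fitted on the in-bag participants would supply the predicted probabilities for the left-out participants, and the statistics would be averaged over the repeated splits.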

Two options were considered for selecting the regression model's intercept: a fixed effects intercept, where a common intercept is estimated after pooling or “stacking” the data together, and a random effects intercept, where the intercept is allowed to vary by study 44 . Random effects (or stratified approaches) have been recommended when the prevalence of an outcome varies substantially between studies 44 , as is the case with the ICPP studies. Alternatively, it could be hypothesized that heterogeneity in endpoint PTSD prevalence across ICPP studies reflected heterogeneity in the distribution of CAPS 0 severity across studies, which was due to variability in studies’ sampling routine. Under this hypothesis, ICPP studies could be seen as representing different samplings from a common parent population of acute care trauma admissions.

The Brier score 79 measures the accuracy of probabilistic predictions. It expresses the mean squared difference between the estimated probabilities and the true PTSD classification. Its range is 0 to 1. A Brier score of zero represents a perfect model and scores of 0.25 or greater signal a non‐informative model. Efron's R 2 is the correlation between the predicted probabilities and the smoothed probabilities.
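As a minimal illustration of the Brier score (the function name is ours), the sketch below computes the mean squared difference between predicted probabilities and 0/1 outcomes. It also shows why 0.25 marks a non-informative model: predicting a probability of 0.5 for every participant yields exactly that value.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities
    and observed 0/1 outcomes (0 = perfect prediction)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - y) ** 2 for p, y in zip(predicted, observed)]
    return sum(squared) / len(squared)


# A coin-flip prediction of 0.5 for everyone scores 0.25.
coin_flip = brier_score([0.5, 0.5, 0.5, 0.5], [0, 1, 0, 1])
# Perfectly confident and correct predictions score 0.
perfect = brier_score([0.0, 1.0], [0, 1])
```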

Logistic regression models were obtained using CAPS 0 as the only predictor (CAPS 0 model), CAPS 0 plus all risk predictors (full model), and CAPS 0 plus significant predictors only (significant predictors model). The models’ fits were evaluated using the Brier score 79 , Efron's R 2 , model's predicted‐to‐raw ratio, and the area under the receiver operating characteristic curve (AUC).
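For reference, the three models share the standard logistic form written below. The symbols are our notation rather than the source's: \(\beta_0\) is the intercept, \(\beta_1\) the CAPS 0 coefficient, and \(\gamma_k\) the coefficients of the additional binary or categorical risk indicators \(x_k\). In the CAPS 0 model all \(\gamma_k\) are zero.

```latex
\[
\Pr(\mathrm{PTSD}=1 \mid \mathrm{CAPS}_0, x)
  = \frac{1}{1+\exp\!\bigl\{-\bigl(\beta_0 + \beta_1\,\mathrm{CAPS}_0
      + \sum_{k}\gamma_k x_k\bigr)\bigr\}}
\]
```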

The relatively large sample size in the ICPP dataset enabled us to obtain simple raw estimates of the probability of downstream PTSD for each CAPS 0 score. The estimator used was the fraction of PTSD cases among all individuals with a given CAPS 0 score, smoothed with a window of five adjacent points.
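A minimal sketch of such an estimator follows. It assumes that smoothing pools the counts of the score itself and its two neighbors on either side; the exact smoothing procedure is not specified in the text, so this choice and the function name are ours.

```python
from collections import Counter


def smoothed_ptsd_fraction(scores, outcomes, window=5):
    """Fraction of endpoint PTSD cases per CAPS 0 score, pooling counts
    over a window of adjacent scores centered on each score.

    scores: integer CAPS 0 total scores; outcomes: 0/1 endpoint PTSD.
    Scores whose window contains no participants are omitted.
    """
    half = window // 2
    n_total = Counter(scores)
    n_cases = Counter(s for s, y in zip(scores, outcomes) if y)
    estimates = {}
    for s in range(min(scores), max(scores) + 1):
        neighbours = range(s - half, s + half + 1)
        total = sum(n_total[k] for k in neighbours)
        if total:
            estimates[s] = sum(n_cases[k] for k in neighbours) / total
    return estimates


# Toy data for illustration only.
estimates = smoothed_ptsd_fraction([10, 10, 11, 12, 20], [0, 1, 0, 0, 1])
```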

Differences in frequency and severity of risk predictors between participants with and without endpoint PTSD were assessed using Mann–Whitney tests for continuous risk predictors and χ 2 tests for categorical risk predictors. The number of participants endorsing each CAPS 0 severity score (smoothed over five‐point intervals) was visualized using a histogram, separately for all participants and for those with PTSD at the study's endpoint.

Participants missing at least one variable (N=791; 32%) differed from those with complete data (N=1,682) with respect to several risk indicators (Table 2 ). To address these missing observations, we present analyses in which missing predictors were handled by multiple imputation using chained equations (MICE) performed on the IPD 77 . Ten imputed datasets were created after twenty iterations and the results were pooled using Rubin's method 78 . For completeness, we also computed the results using individuals who had complete data (i.e., without imputation). The results did not differ substantially from those obtained after imputation and are available upon request.
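Rubin's method combines the estimates from the imputed datasets as follows: the pooled coefficient is the mean of the per-dataset estimates, and its variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance, where m is the number of imputed datasets. The sketch below illustrates the pooling step only (the function name is ours; the imputation itself is not shown).

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: per-dataset coefficient estimates.
    variances: per-dataset squared standard errors.
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching inputs")
    pooled = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Toy inputs: two imputations with estimates 1.0 and 3.0, each with variance 1.0.
pooled, pooled_se = pool_rubin([1.0, 3.0], [1.0, 1.0])
```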

CAPS 0 data were available for all 2,473 participants. Data on age, gender, and current trauma were available for >99% of the sample. Marital status was missing in 4.5%, education in 6.2%, ethnicity in 12.3%, and prior trauma in 16.8% of the sample.

Differences in data collection and instruments across studies required harmonization of four risk indicators. Educational attainment, which varied by participating countries’ schooling systems, was recoded into a binary variable of less than secondary education versus completion of at least secondary education. Recoding participants’ lifetime exposure to traumatic events followed a previous demonstration of a strong association between interpersonal trauma and PTSD 76 and included: a) exposure to at least one instance of interpersonal violence (e.g., physical or sexual violence, war or terror), b) in the absence of the former, exposure to at least one instance of non‐interpersonal trauma (e.g., road traffic accidents), and c) no trauma exposure. Traumatic events leading to current acute care admission were categorized as motor vehicle accidents, other non‐interpersonal events, and interpersonal violence (e.g., assaults).

Information on DSM‐IV Criterion E (duration of at least one month) and F (clinically significant distress or impairment) was collected in four out of the ten studies. A sensitivity analysis within these studies found very high concordance between diagnoses determined by meeting DSM‐IV symptom criteria alone (i.e., criteria B through D) and those obtained using both the symptom criteria and the E and F criteria (sensitivity 0.92, specificity 1.00, Cohen's kappa=0.95). We consequently assumed PTSD diagnosis as present, across studies, based on meeting DSM‐IV PTSD symptom criteria alone.
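For readers unfamiliar with these concordance measures, the sketch below computes sensitivity, specificity and Cohen's kappa from a 2×2 cross-tabulation of two diagnostic rules. The function name is ours and the counts are hypothetical, chosen only to illustrate the calculation; they are not the ICPP cross-tabulation.

```python
def agreement_stats(tp, fp, fn, tn):
    """Sensitivity, specificity and Cohen's kappa for a 2x2 table.

    tp/fn: reference-positive cases classified positive/negative.
    fp/tn: reference-negative cases classified positive/negative.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts for illustration only.
sens, spec, kappa = agreement_stats(tp=92, fp=0, fn=8, tn=900)
```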

The CAPS quantifies the frequency and severity of each of the seventeen DSM‐IV PTSD symptom criteria 73 by assigning to each symptom a 0‐4 incremental frequency score and a 0‐4 intensity score. A continuous measure of PTSD severity is obtained by adding all individual symptom scores (CAPS total score). A diagnosis of PTSD is determined using DSM‐IV PTSD diagnostic criteria of at least one re‐experiencing (Criterion B), three avoidance/numbing (Criterion C), and two hyperarousal (Criterion D) symptoms 73 . Following recommendations, a PTSD symptom was deemed “present” if its frequency score was 1 or more, and its intensity score was 2 or more 74 , 75 .
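The scoring rule can be sketched as follows. The sketch assumes the seventeen items are ordered as in DSM‐IV (items 1‐5 for Criterion B, 6‐12 for Criterion C, 13‐17 for Criterion D) and that the total score is the sum of all frequency and intensity scores; the function name is ours.

```python
def caps_summary(frequency, intensity):
    """Return (CAPS total score, symptom-criteria diagnosis).

    frequency, intensity: 17 scores each, 0-4, in DSM-IV item order
    (assumed: items 1-5 = B, 6-12 = C, 13-17 = D).
    """
    if len(frequency) != 17 or len(intensity) != 17:
        raise ValueError("expected 17 frequency and 17 intensity scores")
    # A symptom is present if frequency >= 1 and intensity >= 2.
    present = [f >= 1 and i >= 2 for f, i in zip(frequency, intensity)]
    n_b = sum(present[0:5])     # re-experiencing
    n_c = sum(present[5:12])    # avoidance/numbing
    n_d = sum(present[12:17])   # hyperarousal
    total = sum(frequency) + sum(intensity)
    diagnosis = n_b >= 1 and n_c >= 3 and n_d >= 2
    return total, diagnosis


# Toy example: one B, three C and two D symptoms at the minimum threshold.
freq = [1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0]
inten = [2, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 2, 2, 0, 0, 0]
total, diagnosis = caps_summary(freq, inten)
```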

Study participants were included if they had an initial CAPS interview within 60 days of the traumatic event, and at least one follow‐up CAPS assessment 4 to 15 months (122 to 456 days) after trauma exposure. These criteria were met by 2,473 participants (Table 1 ). To maximize the utility of prediction, we used the earliest observation for individuals with two early (<60 days) assessments, and the latest observation for those with multiple assessments during follow‐up.
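The inclusion and selection rule can be sketched as below. This is a minimal illustration, assuming each participant's CAPS interviews are represented by their timing in days since the traumatic event and that the 60‐day bound is inclusive; the function name is ours, and this is not the ICPP data-processing code.

```python
def select_assessments(interview_days):
    """Pick the baseline and endpoint interviews for one participant.

    interview_days: days since trauma for each CAPS interview.
    Returns (baseline_day, endpoint_day), or None if the participant
    does not meet the inclusion criteria.
    """
    early = [d for d in interview_days if d <= 60]
    follow_up = [d for d in interview_days if 122 <= d <= 456]
    if not early or not follow_up:
        return None
    # Earliest early assessment, latest follow-up assessment.
    return min(early), max(follow_up)


selected = select_assessments([7, 30, 130, 400])
```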

The ICPP IPD, assembled using a previously described literature search strategy 63 , consisted of thirteen longitudinal acute‐care based studies of recent trauma survivors conducted in six countries. Investigators obtained informed consent using procedures approved by their local institutional review boards. Item‐level data from studies were shared, harmonized (see below) and combined into a pooled dataset. All ICPP studies used the DSM‐IV PTSD template to infer PTSD diagnosis and symptom severity. Included in this report are the ten studies 15 , 64 - 72 that used the repeatedly validated Clinician‐Administered PTSD Scale for DSM‐IV (CAPS) 73 , 74 .

Using data from participants whose last follow‐up assessment fell between 9 and 15 months from the traumatic event (N=1,359) to fit a CAPS 0 ‐only logistic regression yielded similar prediction probabilities (see dotted line in Figure 2 ), with similar model accuracy (Efron's R 2 =0.195, Brier score=0.071, AUC=0.822).

After accounting for the CAPS 0 effect, female participants were found to have a maximum of 5% (95% CI: –2% to 12%) higher risk for endpoint PTSD compared to male participants. Moreover, participants with all significant risk factors (i.e., female gender, less than secondary education, and exposure to prior interpersonal trauma) had a 34% (95% CI: 20‐48%) higher risk of PTSD compared to participants without any significant risk factors (i.e., male with secondary education and no prior interpersonal trauma). Estimated probabilities and 95% confidence intervals for endpoint PTSD based on each combination of the significant predictors are provided in Table 5 .

In the bootstrap analysis comparing the fixed effects logistic model with a random effects model using only CAPS 0 as a predictor, the E/O ratio and β overall from the fixed effects model (1.01 and 1.00, respectively) were closer to 1.00 than those from the random effects model (1.14 and 0.75, respectively), and the Brier score was lower on average for the fixed effects model (0.081, SD=0.01) than for the random effects model (0.084, SD=0.01). Overall, the fixed effects model seems to estimate the likely number of participants with PTSD at follow‐up more accurately, with less heterogeneity or over‐fitting, than the random effects model, thereby supporting the pooling of participating studies.

With the inclusion of all risk indicators (full model) or that of significantly contributing factors (significant predictors model), accuracy remained high (respectively, smoothed probability correlation=0.941, Efron's R 2 =0.246, Brier score=0.078, AUC=0.855; and smoothed probability correlation=0.946, Efron's R 2 =0.246, Brier score=0.078, AUC=0.851). Thus, the addition of female gender, lifetime exposure to interpersonal violence, and less than a secondary education to the CAPS 0 model increased PTSD likelihood whilst keeping the CAPS 0 model's accuracy.

Figure 2. Predicted probabilities of endpoint PTSD conditional on initial (CAPS 0 ) severity scores. The dots represent the raw conditional probability of PTSD at follow‐up given the CAPS 0 score, smoothed with a kernel of width 5. The solid black line represents the logistic model predicted probability given the CAPS 0 score. The gray area is the 95% confidence interval for the prediction model. The dashed line represents the prediction function derived from participants with follow‐up observations later than 9 months. PTSD – post‐traumatic stress disorder, CAPS 0 – baseline score on Clinician‐Administered PTSD Scale for DSM‐IV.

The CAPS 0 model (plotted in Figure 2 along with its 95% confidence interval) fits well (Efron's R 2 =0.230, Brier score=0.080, AUC=0.847), with a very high correlation between the model's predicted probability and the smoothed estimate of conditional probability (r=0.976). Logistic regression using the full model showed that female gender (β=0.309, SE=0.151, p=0.041), having less than a secondary education (β=0.486, SE=0.188, p=0.009), and prior interpersonal trauma (β=0.662, SE=0.238, p=0.006) contributed significantly to the PTSD outcome.

The results from fixed effect models using CAPS 0 alone (CAPS 0 model), CAPS 0 plus all available predictors (full model), and CAPS 0 plus significant predictors only (significant predictors model) are presented in Table 4 .

The histogram in Figure 1 displays the number of participants who endorsed each CAPS 0 score, smoothed over a five‐point interval. As can be seen, the total number of participants declines progressively with increasing CAPS 0 scores. The CAPS 0 scores of participants with endpoint PTSD, however, span the instrument's severity range, such that the proportion of those with endpoint PTSD increases with increasing CAPS 0 severity.

The prevalence of endpoint PTSD was 11.8% (N=291). Endpoint PTSD was significantly more frequent among female participants (16.4%, compared to 9.2% in males, p<0.001) and among participants who suffered interpersonal trauma compared to a motor vehicle accident or other traumatic events (respectively, 27%, 5% and 13%, p<0.001). No significant differences were observed by ethnicity, marital status, or age (see Table 3 ).

Participants’ average age at studies’ onset was 39.0±13.9 years. There were fewer female participants (37%) in the sample than males. Motor vehicle accidents (69%) were the most common index trauma, followed by other types of non‐interpersonal trauma (25%) and interpersonal trauma (6%). The median time to the initial assessment was 15±16.7 days (range 1‐60). The median time to the endpoint assessment was 333±103.1 days (range 122‐456).

DISCUSSION

The results of this study demonstrate that the probability of meeting PTSD diagnostic criteria 4 to 15 months after acute care admission is reliably modeled by a logistic function of initial PTSD symptom severity. Added to this model, female gender, having less than secondary education, and prior interpersonal trauma were associated with higher likelihood of endpoint PTSD. Other previously documented risk factors, such as age, marital status, and current trauma type, did not improve the prediction over the model that had CAPS 0 score as the only predictor. Importantly, the limited margin of error of the resulting risk estimate enables its clinical use to assess PTSD likelihood for each combination of the significant risk indicators.

The limited incremental effect of several known risk factors was an unexpected finding, suggesting that the contribution of these factors to PTSD likelihood is mediated by their effect on early symptom severity. In line with this view, a previous comparison of PTSD following terror attacks with PTSD following motor vehicle accidents from the same ED has shown that the higher prevalence of 4‐month PTSD following terror attacks (38% vs. 19%) was entirely accounted for by survivors’ early responses, which included one‐week PTSD symptoms, ED heart rate and peri‐traumatic dissociation 61 .

Our results extend previous findings of an association between high initial PTSD symptoms and being diagnosed with PTSD 55 - 62 by highlighting the added informational value of likelihood estimates relative to predictive classification. The uniform distribution of PTSD participants’ initial CAPS 0 scores illustrates a barrier to classification models: trauma survivors who ultimately developed PTSD had their initial symptom severity distributed across the entire range of CAPS 0 total scores, thereby defying the use of a threshold separating future cases from non‐cases. Predicting who will develop PTSD, as much as predicting who among heavy smokers will develop lung cancer, is a difficult task, frequently replaced by likelihood estimates. Classification models have significantly informed our understanding of disorders’ etiology and pathogenesis 81 - 86 . Likelihood estimates, however, may be better suited for quantifying individual risk. As in other areas of medicine 48 - 54 , quantifying risk ultimately informs clinical action.

How can our results inform clinical action? Consider, for example, three female survivors with a CAPS 0 score of, respectively, 20, 40, 60; less than secondary education, and lifetime exposure to interpersonal violence. These individuals will have, respectively, 10.4% (95% CI: 7.0‐14.7), 24.1% (95% CI: 17.7‐31.7) and 46.6% (95% CI: 37.2‐56.4) likelihood of chronic PTSD. Male survivors with the same initial scores and no additional risk factors will have, respectively, 2.7% (95% CI: 1.8‐4.0), 7.1% (95% CI: 4.8‐10.1) and 17.3% (95% CI: 12.2‐23.4) likelihood of chronic PTSD. Individuals endorsing the highest CAPS 0 score, in both genders, might be seen as requiring clinical attention, e.g., an early intervention. The lower scores may justify a “watchful wait” with additional assessments.
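Because the model is logistic, the probabilities quoted above should be approximately linear in CAPS 0 on the log-odds scale, and the gap between the two profiles should be roughly constant. The sketch below checks this using only the rounded point estimates from the text, so agreement is approximate. The function names are ours.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


# Point estimates quoted in the text (CAPS 0 score -> probability).
female_high_risk = {20: 0.104, 40: 0.241, 60: 0.466}
male_low_risk = {20: 0.027, 40: 0.071, 60: 0.173}


def slope_per_point(profile):
    """Log-odds increase per CAPS 0 point between consecutive scores."""
    scores = sorted(profile)
    return [(logit(profile[b]) - logit(profile[a])) / (b - a)
            for a, b in zip(scores, scores[1:])]


def profile_gap(high, low):
    """Log-odds difference between two profiles at each shared score."""
    return [logit(high[s]) - logit(low[s]) for s in sorted(high)]


slopes = slope_per_point(female_high_risk) + slope_per_point(male_low_risk)
gaps = profile_gap(female_high_risk, male_low_risk)
```

Each slope comes out at roughly 0.05 log-odds per CAPS 0 point, and each gap at roughly 1.43 log-odds. The gap is close to the sum of the three full-model coefficients reported in the results (0.309 + 0.486 + 0.662 = 1.457), allowing for rounding and for the possibility that the quoted probabilities come from the significant predictors model.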

A strength of this study follows from the use of data on a large number of participants from culturally and geographically diverse settings. Each included investigation utilized a longitudinal design, assessed PTSD symptoms shortly after index trauma, and based its appraisal of symptoms and diagnostic status on the repeatedly validated CAPS instrument.

In interpreting our findings, one should nonetheless consider some limitations. First, the time frame to determine PTSD status in our main analyses was 4‐15 months, thus very wide. However, when the data were restricted to participants re‐interviewed more than 9 months after the trauma, the resulting logistic prediction model remained essentially unchanged. Our prediction is nonetheless calibrated for the wider and earlier time bracket and centered on 333.0±103.1 days (less than a year) from trauma exposure.

Second, several risk predictors were harmonized due to the variety of instruments used by site investigators, which resulted in a loss of granularity. While those harmonized variables (less than secondary education, lifetime interpersonal trauma) have contributed to PTSD probability estimates, results involving recoded variables may miss important predictors’ information. Simplified predictors, however, might be easier to obtain in clinical practice and are widely used in predictive models in other areas of medicine (e.g., “smoking yes/no” and “diabetes yes/no” in the Framingham 10‐year cardiovascular disease risk score).

Third, the ICPP data display considerable heterogeneity among contributing studies, which, as discussed above, raised methodological concerns about the best approach to pooling the data. We found that the fixed effects model was more accurate than the data source dependent random effects model and thus justified pooling from different studies. We also believe that a fixed effects model is more applicable to new environments, because a global slope and intercept were estimated across studies. Our choice, however, is neither beyond critique nor without significance: large multi‐source data compilations are currently evaluated in genetic, genomic and imaging research 87 , all of which have to contend with data source heterogeneity resembling the ICPP effort. Our theoretical premise that ICPP studies were differentially sampling subsets of an underlying population of reference (i.e., acute care trauma admissions) should be corroborated by testing the resulting risk assessment tool in newly admitted acute care trauma survivors.

The use of the CAPS structured clinical interview may add some burden on service delivery, and that interview is not properly a screening instrument. Moreover, several PTSD (i.e., CAPS) symptoms (e.g., insomnia, avoidance, inability to recall important aspects of the traumatic event) may not be present during ED admission. The early CAPS, nonetheless, is a robust risk indicator. Future work should explore earlier and simpler screening alternatives, or establish stepwise “screening and prediction” models, starting upon ED admission and predicting the likelihood of expressing high levels of early PTSD symptoms.

Finally, our model was developed using acute care trauma admissions, and as such its implementation in other traumatic circumstances (e.g., prolonged adversities such as wars, captivity and relocation) may require adjustments. Notwithstanding the precise risk estimates for other traumatic circumstances, we believe that early symptom severity has been convincingly shown here to be a major predictor of PTSD risk, and that, as such, its evaluation among individual survivors provides a valid warning and a call for action.

These limitations do not take away from the robustness of our likelihood estimates and their ability to support a personal risk assessment in individual survivors. Similar risk estimate tools are used in other medical domains to support clinical decisions (e.g., for determining breast 48 or lung 49 , 50 cancer likelihood given risk indicators). The risk estimates provided in this work can be similarly used to trigger action (either watchful follow‐up or early intervention) according to local resources and the desirability of prevention.

Quantifying individual risk is a step forward in planning services and interventions, better targeting high‐risk individuals, and ultimately decreasing the burden of PTSD following acute care admission.