More than 20% of children in the United States will experience a traumatic event before they are 16 years old, and some will go on to develop Post-traumatic Stress Disorder (PTSD). How can we know which child is at risk for PTSD so that it can be prevented? An article published today in BMC Psychiatry is the first of its kind to use machine learning to identify risk factors for childhood PTSD. In this blog, author Glenn Saxe tells us more about his research.


Of children who are exposed to a trauma, between 10 and 40% will develop Posttraumatic Stress Disorder (PTSD). PTSD is a debilitating psychiatric condition that can have a significant impact on a child’s functioning and, perhaps, even on the development of their brain. There is also a growing literature indicating that PTSD may be prevented if a child at risk (i.e. one of the 10-40%) is identified early enough. How can we know which child is at risk for PTSD, as early as possible after trauma exposure, so that this risk may be mitigated? What does the literature say about our ability to predict a child’s risk?

A definitive account of the state of this literature was published a few years ago. Trickey and colleagues conducted a meta-analysis of 64 studies on child PTSD risk published over the previous two decades. They found that the effect sizes for all but a handful of risk factors were very small, and that only 25 risk variables had even been studied more than once. I have been studying risk for child PTSD since the beginning of the two-decade time span of this landmark meta-analysis, and I was not surprised by the discouraging results it reported. PTSD is a complex phenomenon. In all likelihood, its expression involves complex interactions among a wide diversity of biopsychosocial processes that engage over time, in specific ways, under specific conditions.


Several years before this meta-analysis was published, I became convinced that the data methods conventionally used for PTSD risk factor research could not approach the complexity of the subject under study. I began to explore the application of new methods such as Machine Learning predictive classification, non-experimental causal discovery, and complex systems science. These methods were gaining tremendous traction in other fields for the very problems that seemed to plague research in my field. The study published today in BMC Psychiatry entitled “Machine Learning Methods to Predict Child Posttraumatic Stress: A Proof of Concept Study” is an outcome of this effort, and reports the first ever application of Machine Learning to predict childhood PTSD.

This article reports on a longitudinal study of children assessed in the wake of a trauma (hospitalization for an acute injury) to predict who would develop PTSD from information available around the time of the trauma. While the child was in the hospital, we assessed a wide variety of risk factors, including candidate genes, neurochemical and neurophysiologic responses, hospital treatment, qualities of the child’s family and early development, and parents’ stress and mental health. We then applied state-of-the-art Machine Learning methods to all the risk information we collected in order to build models of risk for PTSD and to test these models for reliability and accuracy.

What is Machine Learning? Details are found in our article but – broadly and briefly – Machine Learning is a computational approach designed to find patterns in data that will form reliable and accurate predictive models of specified events (e.g. the occurrence of PTSD in a traumatized child). The ‘machine’ will use specified algorithms to search through the space of possibilities contained in the data, to arrive at a predictive model. The reliability and accuracy of this predictive model is then tested with data the ‘machine’ has not yet seen.
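The idea of building a model on one portion of the data and testing it on data the 'machine' has not yet seen can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are purely synthetic (two made-up numeric features and a binary outcome), the nearest-centroid classifier is a deliberately simple stand-in rather than any of the methods used in the study, and all function names are hypothetical.

```python
import random

def make_synthetic_data(n, seed=0):
    """Return (features, label) pairs. Purely synthetic, NOT study data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Assumption for illustration: the two classes differ in their mean.
        center = 2.0 if label == 1 else 0.0
        features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((features, label))
    return data

def fit_centroids(train):
    """'Learn' one mean point (centroid) per class from the training rows."""
    sums, counts = {}, {}
    for features, label in train:
        s = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            s[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in s)
            for label, s in sums.items()}

def predict(centroids, features):
    """Predict the class whose centroid is closest to the given features."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for features, label in rows
               if predict(centroids, features) == label)
    return hits / len(rows)

data = make_synthetic_data(200, seed=0)
train, test = data[:140], data[140:]       # the model never sees `test`
model = fit_centroids(train)               # search for a pattern in `train`
held_out_accuracy = accuracy(model, test)  # judge it on unseen data
print(f"held-out accuracy: {held_out_accuracy:.2f}")
```

The point of the split is that accuracy measured on the training rows can flatter a model that has simply memorized them, whereas accuracy on held-out rows is a fairer estimate of how the model would perform on new children.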

We used five different approaches to Machine Learning predictive classification, and each of the resulting models showed strong predictive performance, certainly stronger than that of the conventional methods we compared them with. We also integrated a feature selection method designed to find the variables that had causal influence on PTSD. We understand that this claim of causality will often arouse incredulity within the psychiatric research community. How can we make such a claim?


Our causal discovery feature selection methods are designed to find what is called the Markov Boundary for a response variable (R) (e.g. PTSD). The Markov Boundary is the smallest set of variables that, once known, renders every other variable in the data independent of R. In the majority of distributions, the Markov Boundary also has a causal interpretation, as it consists of the direct causes of R, the direct effects of R, and the direct causes of those direct effects. When the response variable is a terminal event (i.e. no variables occur after R), the Markov Boundary is simply the set of direct causes of R. What are the implications of finding such a Markov Boundary?
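To make the definition concrete, here is a minimal Python sketch that reads the Markov Boundary off a causal graph that is already known. This is not the induction algorithm from our article, which has to infer the boundary from data alone. The toy graph and its node names (A, B, C, R, E, S, U) are hypothetical and chosen only for illustration.

```python
def markov_boundary(parents, target):
    """Return the Markov Boundary of `target` in a known causal graph.

    `parents` maps each node to the set of its direct causes.
    The boundary is: direct causes + direct effects + the other
    direct causes of those direct effects.
    """
    direct_causes = set(parents.get(target, set()))
    direct_effects = {node for node, ps in parents.items() if target in ps}
    other_causes_of_effects = set()
    for effect in direct_effects:
        other_causes_of_effects |= set(parents[effect])
    other_causes_of_effects.discard(target)
    return direct_causes | direct_effects | other_causes_of_effects

# Hypothetical toy graph: A -> C -> R <- B, R -> E <- S, and U unconnected.
toy_graph = {
    "A": set(),
    "B": set(),
    "C": {"A"},
    "R": {"B", "C"},
    "E": {"R", "S"},
    "S": set(),
    "U": set(),
}

# R has a direct effect (E), so the boundary includes E and E's other cause S.
boundary_with_effect = markov_boundary(toy_graph, "R")

# Remove E so that R is terminal: the boundary is now only R's direct causes.
terminal_graph = {node: ps for node, ps in toy_graph.items() if node != "E"}
boundary_terminal = markov_boundary(terminal_graph, "R")

print(sorted(boundary_with_effect))  # ['B', 'C', 'E', 'S']
print(sorted(boundary_terminal))     # ['B', 'C']
```

Note that A, an indirect cause of R that acts only through C, is left out of both boundaries: once C is known, A carries no further information about R.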

Our methods have enabled the discovery of the Markov Boundary for the terminal response variable of PTSD. Therefore, we have discovered the set of direct causes of PTSD. Some of you may be tempted to search through our article to find the method we used to conduct the experiment that allowed such causal inference. Alas, none will be found, and that is the point. Should we have randomly assigned a child to a trauma condition? Of course, ethical considerations cannot allow such a study and, therefore, there is no ethically allowable search for causation using conventional methods in our field. And, therefore, there is no available knowledge of causation (outside of animal research) in our field. And this is a very big problem.

In our article, we include many references documenting the mathematical rigor of Markov Boundary induction, and the track record of Markov Boundary induction algorithms for detecting true causes, from data sets where true causes were previously known.

What did we find? We found a set of causal variables, measured around the time of trauma, that predicted PTSD three months later. This set included specific candidate genes, which may point to the potential preventative use of specific medications targeted towards the biological processes encoded by these genes. Other identified causal variables were more directly remediable, such as the child’s level of acute pain and the parent’s level of acute distress. Interestingly, we also found that the child’s history of being breast fed as an infant and the child’s regular attendance at religious services were protective against PTSD, indicating the important role of attachment, community, and spirituality in a child’s recovery after trauma.