Parents, players, and coaches are much more aware of the seriousness of sports concussions than they used to be. This is good news — not only are concussions a damaging form of brain injury in their own right, but for a time they leave the brain extra vulnerable to further knocks. That’s why it’s so important that concussions are detected and that concussed players are rested before returning to the action.

Claiming to meet this need, an abundance of computerized sports concussion tests — known formally as “computerized neurocognitive assessment tools” — have popped up, and many of them have shortened mobile-app versions. For example, ImPACT, one of the market leaders, says that it “provides trained clinicians with neurocognitive assessment tools and services that have been medically accepted as state-of-the-art best practices — as part of determining safe return to play decisions.”

The idea behind these computerized tests is that players’ mental performance (things like memory, reaction time, and impulse control) can be tested when they’re healthy and uninjured, and then if or when they suffer a blow to the head, they can retake the test to see if their mental performance is in any way impaired relative to their own baseline. If it is, this would be taken as a sign of concussion. Post-injury, the test can then be retaken periodically, over days and weeks, to look for signs of improvement and help make judgments about when it is safe for the athlete to return to training and playing again.
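To make the baseline idea concrete, here is a minimal sketch in Python. The measure names, scores, and thresholds are all hypothetical and chosen for illustration; none of them come from ANAM, Axon, or ImPACT, whose actual scoring rules are not described here. The sketch assumes every measure is scored so that higher is better.

```python
def change_from_baseline(baseline, retest):
    """Return the change on each measure (retest minus baseline)."""
    return {name: retest[name] - baseline[name] for name in baseline}


def flag_declines(baseline, retest, thresholds):
    """List the measures whose score dropped by at least the given threshold.

    Assumes higher scores mean better performance on every measure.
    The thresholds are hypothetical, not any vendor's published cutoffs.
    """
    changes = change_from_baseline(baseline, retest)
    return [name for name, delta in changes.items() if delta <= -thresholds[name]]


# Hypothetical athlete: scores when healthy versus after a blow to the head.
baseline = {"memory": 90, "reaction": 80}
post_injury = {"memory": 78, "reaction": 79}
thresholds = {"memory": 10, "reaction": 5}

print(flag_declines(baseline, post_injury, thresholds))
```

In this made-up case only the memory score falls far enough to be flagged. Everything hinges on how the thresholds are set: too tight, and ordinary day-to-day wobble gets flagged as impairment.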

That’s the theory, but do these tests really work as well as their makers claim? And is it safe to use them as a substitute for in-depth assessment and supervision from a trained physician? According to a study published recently in the Journal of the International Neuropsychological Society, these tests can help detect cognitive impairments over the first 24 hours after an injury, but they’re not accurate enough to replace professional clinical evaluation. Beyond 24 hours, their accuracy deteriorates sharply, which means they can’t be relied on as a way to measure concussion recovery.

Lindsay Nelson at the Medical College of Wisconsin and her colleagues road-tested three of the full versions of the most popular computerized concussion tests: ANAM from Vista Life Sciences, Axon from Axon Sports, and ImPACT from ImPACT Applications Inc.

Between 2012 and 2014, the researchers recruited thousands of “contact and collision sport” athletes from high schools and colleges in Wisconsin, and had each of them complete two of the three concussion tests. These initial scores served as their baseline performance on the tests. During the course of the study, 166 of these athletes suffered a concussion (as judged by comprehensive medical interviews and symptom checklists), and a further 166 uninjured athletes, matched for school, sports, and gender, were chosen to act as a control group.

One key to understanding the effectiveness of these tools was to see how consistent they were in their assessments of the uninjured test subjects (in psychometrics, this is known as a test’s “reliability,” as opposed to its “validity,” which refers to whether it’s measuring what it claims to be measuring). For a test to be considered reliable, a person’s scores on it should, in the absence of injury, remain fairly stable across repeated testing, rather than being overly swayed by things like the person’s current tiredness or motivation. If, on the other hand, a concussion test offers hugely different readings of the same individual at different times, then it’s functionally worthless, because you can’t tell whether any change in scores is due to concussion or to more mundane reasons, like a bored or distracted athlete.
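One simple way to put a number on this kind of consistency is to correlate the same uninjured people’s scores across two sessions. The sketch below uses a plain Pearson correlation as a stand-in; it is not necessarily the statistic the researchers used, and the scores are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical scores for five uninjured athletes on two occasions.
first_session = [82, 90, 75, 88, 95]
second_session = [80, 93, 70, 85, 96]

print(round(pearson_r(first_session, second_session), 2))
```

A value near 1 means athletes keep roughly the same rank order from one session to the next; the further the value falls below that, the more a given change in score could simply be noise.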

Members of the concussed group were retested on the apps roughly 24 hours after their injury, and then again roughly 8, 15, 45, and 128 days later. Members of the control group retook the tests soon after they were selected for the study, and then they took the tests again after 7, 14, 44, and 128 days.

The tests didn’t fare too well: the researchers said they showed only “modest” reliability, a level “generally lower than is considered needed to contribute meaningfully to clinical decisions.” This is bad news for users of the tests, not to mention the concussed athletes being evaluated.

Another important measure of a test’s quality is its “sensitivity,” that is, how good it is at detecting real concussions. (Its counterpart, “specificity,” is how good the test is at avoiding false positives.) To check this, the researchers looked at whether the concussed athletes showed bigger changes in test performance than the control athletes. In theory, the concussed athletes’ performance should take a hit after the concussion and then, over time, return to baseline.

At the 24-hour post-injury test, the concussed group did show greater changes relative to baseline than the control group, but still the researchers said the size of this difference only “translated to fair to poor discrimination between groups” — again, not great news for the computerized tests. On testing eight days after injury and beyond, there was little difference in the test performance between the concussed and control groups compared with their own baselines (even though roughly 35 percent, 15 percent, and 1.5 percent of concussed athletes were still complaining of concussion-related symptoms such as headache and dizziness in their clinical interviews at day 8, day 15, and day 45 post-injury, respectively). Past research found that the cognitive impairments associated with concussion take on average five to seven days to resolve, but that around 10 percent of concussed athletes take longer than a week to recover. The new results suggest that computerized tests are unlikely to be useful for identifying this slow-to-recover minority.

Finally, what about using the tests’ own published thresholds for determining signs of impairment? The greatest sensitivity here was shown by the ImPACT test at 24 hours post-injury: it classified nearly 68 percent of concussed athletes as showing one or more deficits in their mental performance, compared with 60 percent for the Axon test and 48 percent for ANAM.

This is certainly better news for the computerized tests and suggests they could have some use, especially in the first 24 hours after an injury. The trouble is, the tests also categorized a large portion of the healthy control athletes (between 25 and 30 percent across the three apps) as showing signs of impairment on their first retest compared with baseline. And by eight days post-injury, the proportion of athletes flagged as still showing cognitive impairments (between 31 and 49 percent across the three apps) was similar to the proportion of controls classified in the exact same way (between 29 and 39 percent).
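The arithmetic behind that worry can be sketched in a few lines of Python. The counts below are round, hypothetical numbers loosely based on the percentages above (68 percent of concussed athletes flagged, and 30 percent of controls, the top of the reported range); the group size of 100 each is an assumption made purely for illustration.

```python
def sensitivity(true_positives, false_negatives):
    """Share of genuinely concussed athletes that the test flags."""
    return true_positives / (true_positives + false_negatives)


def false_positive_rate(false_positives, true_negatives):
    """Share of healthy athletes that the test wrongly flags."""
    return false_positives / (false_positives + true_negatives)


def share_of_flags_correct(true_positives, false_positives):
    """Of all athletes flagged, the share who really were concussed."""
    return true_positives / (true_positives + false_positives)


# Hypothetical groups of 100 concussed and 100 healthy athletes.
flagged_concussed, missed_concussed = 68, 32
flagged_healthy, cleared_healthy = 30, 70

print(sensitivity(flagged_concussed, missed_concussed))
print(false_positive_rate(flagged_healthy, cleared_healthy))
print(round(share_of_flags_correct(flagged_concussed, flagged_healthy), 2))
```

Under these assumptions, roughly three in ten flags would land on a healthy athlete. In a real team, where far fewer than half of players are concussed at any one time, false alarms would make up an even larger share of all flags.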

Overall, Nelson and her team said their results were “consistent with the current consensus within the broader community that, although neurocognitive tests can contribute to the overall clinical picture, they should not be considered in isolation or favored over multidimensional clinical assessment approaches.” Bear in mind that this was their conclusion for the full version of these computerized tests — there’s every reason to believe the shortened, mobile versions are even less trustworthy.

Maybe in the future these programs will grow more sophisticated and robust. For now, though, it seems fair to say that none of them should be used as a substitute for the judgment and care of properly trained health professionals.

Dr. Christian Jarrett (@Psych_Writer), a Science of Us contributing writer, is editor of the British Psychological Society’s Research Digest blog. His latest book is Great Myths of the Brain.