One of the larger problems facing the security industry in the era of mass terrorism is the task of creating a profile of a likely terrorist. Identifying those at risk of first-time offenses is a challenge in any context, but the stakes are higher when that offense may also be the last, and involve the deaths of dozens of people. We've discussed the challenges of generating profiles of potential terrorists in the past, but a study that will be released by the Proceedings of the National Academy of Sciences offers a mathematical analysis of how we're deploying the profiles we do have, and suggests we may not be using them wisely.

The study was performed by William Press, who does bioinformatics research at the University of Texas at Austin, with a joint appointment at Los Alamos National Laboratory. His background in statistics is apparent in his ability to handle various mathematical formulae with aplomb, but he's apparently used to explaining his work to biologists, since the descriptions that surround those formulae make the general outlines of the paper fairly accessible.

Press starts by examining what could be viewed as an idealized situation, at least from the screening perspective: a single perpetrator living under an authoritarian government that has perfect records on its citizens. Applying a profile to those records should allow the government to rank those citizens in order of risk, and it can screen them one by one until it identifies the actual perpetrator. Those circumstances lead to a pretty rapid screening process, and they can be generalized to a situation where there are multiple likely perpetrators.
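
The idealized case can be sketched in a few lines of Python. This is an illustration rather than anything taken from the paper: the function names and the three-person population are hypothetical, and the sketch assumes that the risk values are accurate probabilities summing to 1 and that a screening always identifies the perpetrator.

```python
def expected_checks_ranked(priors):
    """Mean number of screenings when people are checked once each,
    in descending order of risk (assumes perfect records and detection)."""
    ranked = sorted(priors, reverse=True)
    # The perpetrator sits at position `rank` with probability p.
    return sum(rank * p for rank, p in enumerate(ranked, start=1))


def expected_checks_random_order(priors):
    """Mean number of screenings when the same list is checked in random order."""
    # Each position in a random ordering is equally likely to hold the perpetrator.
    return (len(priors) + 1) / 2


# Hypothetical three-person population, for illustration only.
priors = [0.5, 0.3, 0.2]
ranked_cost = expected_checks_ranked(priors)
random_cost = expected_checks_random_order(priors)
```

Under these assumptions, working down the ranked list never takes more screenings on average than a random ordering, which is why the idealized case looks so favorable.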

Things go rapidly sour for this system, however, as soon as you have an imperfect profile. In that case, which is more likely to reflect reality, there's a nonzero chance that the screening process misses a likely security risk. Since it works its way through the list of individuals iteratively, it never goes back to rescreen someone who has made it through the first pass. The impact of this flaw grows rapidly as the ability to accurately match the profile to the data available on an individual gets worse. Since we've already said that making a profile is challenging, and we know that even authoritarian governments don't have perfect information on their citizens, this system is probably worse than random screening in the real world.

In the real world, of course, most of us aren't going through security checks run by authoritarian governments. In Press' phrasing, democracies resample with replacement, in that they don't keep records of who goes through careful security screening at places like airports, so people get placed back on the list to go through the screening process again. One consequence of this is that, since screening resources are never infinite, we can only resample a small subset of the total population at any given moment.

Press then examines the effect of what he terms a strong profiling strategy, one in which a limited set of screening resources is deployed solely based on the risk probabilities identified through profiling. It turns out that this also works poorly as the population size goes up. "The reason that this strong profiling strategy is inefficient," Press writes, "is that, on average, it keeps retesting the same innocent individuals who happen to have large p_j [risk profile match] values."

According to Press, the solution is something that's widely recognized by the statistics community: select individuals for robust screening in proportion to the square root of their risk value, rather than the risk value itself. That gives the profile some weight, but distributes the screening much more broadly through the population, and uses limited resources more effectively. It's so widely used in mathematical circles that Press concludes his paper by writing, "It seems peculiar that the method is not better known."
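
A rough way to see why the square root helps is to compare strategies numerically. The sketch below is not code from the paper: the function names and the example population are hypothetical, and it assumes a simplified model in which one person is drawn per screening, with replacement, with probability proportional to a strategy-specific weight, and in which screening the perpetrator always catches them. In that model, if person j is the perpetrator with probability p_j and is drawn with probability q_j, the mean number of screenings needed is the sum of p_j / q_j.

```python
import math


def expected_screenings(priors, weights):
    """Mean number of with-replacement screenings until the perpetrator is drawn.

    priors[j]  : probability that person j is the perpetrator (assumed accurate)
    weights[j] : relative chance that person j is picked for any one screening
    """
    total = sum(weights)
    # If j is the perpetrator, the wait is geometric with mean 1 / q_j.
    return sum(p / (w / total) for p, w in zip(priors, weights) if p > 0)


def compare_strategies(priors):
    """Evaluate uniform, strong (proportional), and square-root sampling."""
    return {
        "uniform": expected_screenings(priors, [1.0] * len(priors)),
        "strong": expected_screenings(priors, priors),
        "square_root": expected_screenings(priors, [math.sqrt(p) for p in priors]),
    }


# Hypothetical population of 1,000: ten people carry half of the total risk.
priors = [0.05] * 10 + [0.5 / 990] * 990
results = compare_strategies(priors)
```

In this toy model, sampling in direct proportion to risk does no better than uniform random sampling (both average one screening per member of the population), while square-root weighting comes out lower, which matches the article's point that a purely profile-driven strategy keeps retesting the same high-risk innocents.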

We're not privy to the exact details of various screening systems, so it's possible that the optimal solution is in use in a number of contexts. But, given that things like racial profiling are used in so many law enforcement contexts, from community policing to immigration, it's a safe bet that there are a fair number in which it's not. And, given that the use of profiles is frequently the subject of public debate, having a public that's informed of the limits of profiling could certainly help inform those debates.

PNAS, 2009. DOI: 10.1073/pnas.0813202106