I’ve often wondered whether future varieties of artificial intelligence might not help humans discover incredible new things about reality. Perhaps machines will even unveil little nuances that have been hidden beneath our noses all along, but which required the kind of critiques that only a computer mind could offer.

But while we are thinking about the ways ever-watchful A.I. may help us critique reality in the future, another question comes to mind, one that mirrors the old line “quis custodiet ipsos custodes?” (“who will guard the guards themselves?”) posed by the Roman satirist Decimus Junius Juvenalis: in a world where we’re learning from the machines we’ve built, who fact-checks those machines, and more importantly, how do we know that what we learn from them can even be trusted?

This might sound like a silly proposition, due in part to the sorts of stigmas we have about artificial intelligence and machine learning. On the one hand, many would subscribe to notions like, “of course we shouldn’t trust machines completely… they aren’t human!” In direct contrast to this, advocates of machine learning and A.I. might argue that this is precisely the reason they should be trusted: machines are, and will continue to be, capable of processing and synthesizing information in ways that often far exceed what the human brain can do.

Genevera Allen, a Rice University statistician, recently made the argument that until machine learning systems can be designed in ways that allow for objective critiques of the information they provide, their trustworthiness remains in doubt.

Allen spoke earlier in February at the annual meeting of the American Association for the Advancement of Science, where she addressed this problem as part of her lecture:

“The question is, ‘Can we really trust the discoveries that are currently being made using machine-learning techniques applied to large data sets?’ The answer in many situations is probably, ‘Not without checking,’ but work is underway on next-generation machine-learning systems that will assess the uncertainty and reproducibility of their predictions.”

Fundamental to the problem is the predictive nature of many (if not most) computational systems. Allen argues that because making a prediction from data is what these computers are designed to do, they will almost always report a pattern when tasked with finding one… even if a human observing the same data could easily discern that, in some instances, none exists.

“[Machines] never come back with ‘I don’t know,’ or ‘I didn’t discover anything,’ because they aren’t made to,” Allen said in a Rice University press release. However, the concerns Allen raises aren’t merely prospective: instances already exist, in cancer research, where questionable conclusions appear to have been drawn, at least in part, from studies that relied on uncorroborated computational results.

Allen explains:

“[T]here are cases where discoveries aren’t reproducible; the clusters discovered in one study are completely different than the clusters found in another… Because most machine-learning techniques today always say, ‘I found a group.’ Sometimes, it would be far more useful if they said, ‘I think some of these are really grouped together, but I’m uncertain about these others.'”
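To make the point concrete, here is a minimal sketch of my own (it is not taken from Allen's lecture or from any study she mentions). It assumes a bare-bones k-means routine, which is only one of many clustering techniques, and feeds it uniform random points that contain no real groups at all. The function names `kmeans` and `pair_agreement` are hypothetical helpers invented for this illustration.

```python
import random


def kmeans(points, k, seed, iters=20):
    """Bare-bones k-means on 2-D points; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center. Note that there is
        # no option here for "this point belongs to no group".
        labels = [
            min(range(k),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2)
            for x, y in points
        ]
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return labels


def pair_agreement(a, b):
    """Fraction of point pairs that both labelings treat the same way
    (both put the pair together, or both keep it apart)."""
    n = len(a)
    agree = sum((a[i] == a[j]) == (b[i] == b[j])
                for i in range(n) for j in range(i + 1, n))
    return agree / (n * (n - 1) / 2)


# Pure noise: 300 points scattered uniformly, with no structure to find.
data_rng = random.Random(0)
noise = [(data_rng.random(), data_rng.random()) for _ in range(300)]

# Two runs that differ only in their random starting centers.
labels_a = kmeans(noise, k=3, seed=1)
labels_b = kmeans(noise, k=3, seed=2)

sizes_a = [labels_a.count(c) for c in range(3)]
print("cluster sizes, run A:", sizes_a)
print("agreement between runs:", round(pair_agreement(labels_a, labels_b), 3))
```

Whatever the data look like, every point comes back with a group label, because the algorithm has no way of answering "I didn't find anything." Comparing two runs with `pair_agreement` is one crude way of asking whether a grouping is reproducible; an agreement score well below 1.0 would suggest the "groups" depend more on where the algorithm started than on the data.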

I recall a discussion I had with a futurist a few months ago, where we took up the question of the dangers that may arise from the way machine reasoning differs from our own. The example he gave is relevant here, although it is more in line with the “Hollywood” conception of potential problems we may one day have with A.I. (we might call this the “Terminator” model, which I think needs no further explanation here).

Imagine, he said, if we told A.I. to find a way to destroy a particular human disease. We input the information, and the computer outputs the following solution: “destroy all carriers of the disease.” In other words, rather than finding a cure, the machine interprets the problem in simple terms of its elimination… whereby the machine makes no distinction between “solving” the problem and committing murder.

This is a dramatic example, but it serves as an analogy for what we already see in studies where machine learning is involved. Computers, in other words, function and respond to data sets in ways that differ vastly from human reason. Thus, we do need to be aware of these kinds of issues as more and more computational systems shape the knowledge that science acquires and works with in the future.