Yep

I’ve disagreed with Judea Pearl before on causality, and I do so again below; but first some areas of agreement. Some deep agreement at that.

Pearl has a new book out, The Book of Why (which I have not read yet), which was the subject of an interview he gave to Quanta Magazine.

Q: “People are excited about the possibilities for AI. You’re not?”

As much as I look into what’s being done with deep learning, I see they’re all stuck there on the level of associations. Curve fitting. That sounds like sacrilege, to say that all the impressive achievements of deep learning amount to just fitting a curve to data. From the point of view of the mathematical hierarchy, no matter how skillfully you manipulate the data and what you read into the data when you manipulate it, it’s still a curve-fitting exercise, albeit complex and nontrivial.

This is true: perfectly true, inescapably true. It is more than that: it is tough-cookies true.

Fitting curves is all computers can ever do. Pearl doesn’t accept that limitation, though, as we shall see.

Q: “When you share these ideas with people working in AI today, how do they react?”

AI is currently split. First, there are those who are intoxicated by the success of machine learning and deep learning and neural nets. They don’t understand what I’m talking about. They want to continue to fit curves. But when you talk to people who have done any work in AI outside statistical learning, they get it immediately. I have read several papers written in the past two months about the limitations of machine learning.

Don’t despair, Pearl, old fellow, I share your pain. I too have written many articles about the limitations of machine learning, AI, deep learning, et cetera.

Q: “Yet in your new book you describe yourself as an apostate in the AI community today. In what sense?”

In the sense that as soon as we developed tools that enabled machines to reason with uncertainty, I left the arena to pursue a more challenging task: reasoning with cause and effect. Many of my AI colleagues are still occupied with uncertainty. There are circles of research that continue to work on diagnosis without worrying about the causal aspects of the problem. All they want is to predict well and to diagnose well. I can give you an example. All the machine-learning work that we see today is conducted in diagnostic mode — say, labeling objects as “cat” or “tiger.” They don’t care about intervention; they just want to recognize an object and to predict how it’s going to evolve in time. I felt an apostate when I developed powerful tools for prediction and diagnosis knowing already that this is merely the tip of human intelligence. If we want machines to reason about interventions (“What if we ban cigarettes?”) and introspection (“What if I had finished high school?”), we must invoke causal models. Associations are not enough — and this is a mathematical fact, not opinion.

Associations, which are what statisticians would call correlations, are not enough, amen, but that’s more than just a mathematical fact. It is just plain true.

Q: “What are the prospects for having machines that share our intuition about cause and effect?”

We have to equip machines with a model of the environment. If a machine does not have a model of reality, you cannot expect the machine to behave intelligently in that reality. The first step, one that will take place in maybe 10 years, is that conceptual models of reality will be programmed by humans. The next step will be that machines will postulate such models on their own and will verify and refine them based on empirical evidence. That is what happened to science; we started with a geocentric model, with circles and epicycles, and ended up with a heliocentric model with its ellipses. Robots, too, will communicate with each other and will translate this hypothetical world, this wild world, of metaphorical models.

Now I do not know exactly what Pearl has in mind with his “model of the environment” and “model of reality”, since I haven’t yet read the book. But if it’s just a list of associations (however complex) which are labeled, by some man, as “cause” and “effect”, then it is equivalent to a paper dictionary. The book doesn’t know it’s speaking about a cause, it just prints what it was told by an entity that does know (where I use that word in its full philosophical sense). The computer can be programmed to move from these to identifying associations consonant with this dictionary, but this is nothing more than advanced curve fitting. The computer has not learned about cause and effect. The computer hasn’t learned anything. It is a mindless box, an electronic abacus incapable of knowing or learning.

This is why I disagree with Pearl again, when he later says “We’re going to have robots with free will, absolutely. We have to understand how to program them and what we gain out of it. For some reason, evolution has found this sensation of free will to be computationally desirable.” Evolution hasn’t found a thing, and you cannot have a feeling of free will without having free will: it is impossible. Robots, being mindless machines, can thus never have free will, because we will never figure a way to program minds into machines, and minds are needed for feelings. Why? For the very good reason that minds are not material.

Nope

Not coincidentally, in the theological teleological sense of the word, Ed Feser has a new speech on the immateriality of the mind which should be viewed (about 45 minutes). I will take that speech as read—and not as the quale red, which is a great joke. Meaning I’m not going to defend that concept here: Feser has already done it. I’m accepting it as a premise here.

The mind isn’t made of stuff. It is not the brain. Man, paraphrasing Feser, is one substance and the intellect is one power among the many other physical powers we possess. (This is not Descartes’s dualism.) The mind is not a computer. It is much more than that. Computers are nothing more than abacuses, and there’s nothing exciting about that.

Man is a rational creature, and his mind is not material. Rationality is (among other things) the capacity to grasp universals (which are also immaterial). Cause is a universal. Cause is understood in the mind. (How we learn has been answered, as I’m sure you know as you’ve been following along, in our recent review of chapters in Summa Contra Gentiles.) Causes exist, of course, and make things happen, but our knowledge of them is an extraction from data, as the extraction of any universal is. Cause doesn’t exist “in” data. We can see data, and we move from it to knowledge of cause. But no algorithm can do this, because algorithms never know anything, and in particular no algorithm engineered to work on any material thing, like a computer, like even a quantum computer, can know anything. (Yes, we make mistakes, but that does not mean we always do.)

This means we will never be able to build any machine that does more than curve fitting. We can teach a computer to watch as we toss people off of rooftops and watch them go splat, and then ask the computer what will happen to the next person tossed off the roof. If we have taught this computer well, it will say splat. But the computer does not know what it is talking about. It has fit a curve and that is that. It doesn’t know the difference between a person and a carrot: it doesn’t know what splat means: it doesn’t know anything.

We can program this mindless device to announce, “Whenever the correlation exceeds some level, as it is in the tossed-splat example, print on the screen a cause was discovered.” Computers are good at finding these kinds of correlations in data, and they work tirelessly. But a cause has not been discovered. Merely a correlation. Otherwise all the correlations listed at Spurious Correlations would be declared causes.
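To make the point concrete, here is a minimal Python sketch of such a "cause announcer". Everything in it is an assumption for illustration: the function names `pearson` and `announce`, the 0.9 threshold, and the two invented series, which are not real data and merely happen to rise together.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def announce(xs, ys, threshold=0.9):
    """Hypothetical rule: declare 'cause' whenever |r| exceeds the threshold."""
    r = pearson(xs, ys)
    if abs(r) > threshold:
        return "a cause was discovered"
    return "no cause discovered"

# Two invented series (not real measurements) that both trend upward.
series_a = [29.8 + 0.3 * t for t in range(10)]
series_b = [330 + 50 * t + (7 if t % 2 else -7) for t in range(10)]

verdict = announce(series_a, series_b)
print(verdict)
```

The rule fires on any pair of series that trend together. Nothing in the code refers to what the numbers measure, which is the complaint: the threshold is a criterion somebody typed in, not a discovery the machine made.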

Saying computers can discover a universal like cause is equivalent in operation to hypothesis testing, though not necessarily with a p-value. If the criterion to say “cause” isn’t a p-value, it has to be something, some criterion that says, “Okay, before not-cause, now cause.” It doesn’t matter what it is, so if you think it’s not a p-value, swap out “p-value” below with what you have in mind (“in mind”—get it?). In the upcoming peer-reviewed paper (and therefore perfectly true and indisputable) “Everything Wrong With P-Values Under One Roof” (to be published in a Springer volume in January: watch for the announcement), I wrote:

In any given set of data, with some parameterized model, its p-values are assumed true, and thus the decisions based upon them sound. Theory insists on this. The decisions “work”, whether the p-value is wee or not wee. Suppose a wee p-value. The null is rejected, and the “link” between the measure and the observable is taken as proved, or supported, or believable, or whatever it is “significance” means. We are then directed to act as if the hypothesis is true. Thus if it is shown that per capita cheese consumption and the number of people who died tangled in their bed sheets are “linked” via a wee p, we are to believe this. And we are to believe all of the links found at the humorous web site Spurious Correlations, \cite{vigen_2018}. I should note that we can either accept that grief of loved ones strangulated in their beds drives increased cheese eating, or that cheese eating causes sheet strangulation. This is a joke, but also a valid criticism. The direction of causal link is not mandated by the p-value, which is odd. That means the direction comes from outside the hypothesis test itself. Direction is thus (always) a form of prior information…

I go on to say that direction is a kind of prior information forbidden in frequentist theory. But direction is also not something in the data. It is something we extract, as part of the cause.
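One way to see that direction is not in the data, at least as far as a correlation measure is concerned: the Pearson coefficient is symmetric in its two arguments. The sketch below is only an illustration under assumptions; the helper `pearson` and the toy numbers are made up for the purpose and come from no study.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy numbers, invented for illustration only.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 3.0, 3.5, 8.0, 9.0]

r_xy = pearson(x, y)  # "x drives y"?
r_yx = pearson(y, x)  # "y drives x"?

# The two readings yield the same number, so the statistic cannot pick one.
same = math.isclose(r_xy, r_yx)
print(same)
```

Swapping the arguments changes nothing, so whichever way the arrow is drawn, it was drawn by whoever labeled the columns.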

I have no doubt that our algorithms will fit curves better, though where free will is involved, or the system is physically complex, we will bump up against impossibility sooner or later, as we do in quantum mechanical events. I have perfect certainty that no computer will ever have a mind, because having a mind requires the cooperation of God. Computers may well pass the Turing test, but this is no feat. Even bureaucrats can pass it.
