How does your brain generate accurate perceptual experiences? How does it initiate action? How does it do virtually everything else it does? Jakob Hohwy's book provides an ambitious, controversial answer. He argues that one mechanism explains everything the brain does, from "perception [to] action and everything mental in between" (p. 1).

The mechanism aims at prediction error minimization. In Hohwy's view, the brain constructs a model of the world containing hypotheses about objects and properties of all kinds. To test the model, the brain predicts the sensory inputs it's likely to receive if the model is accurate. Then it compares the predicted inputs with the inputs it actually receives. If there's a match between the predicted and actual inputs, the model is confirmed. If there's a mismatch, prediction error occurs. To minimize prediction error and improve the model's accuracy, the brain revises the model, generates new predictions, and tests them against subsequent inputs.
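To see the shape of this loop, here is a minimal toy sketch in Python (my own illustrative construction, not from the book): a single scalar hypothesis is repeatedly revised to shrink the gap between the predicted and actual input.

```python
# Toy sketch of prediction error minimization. The variable names and the
# simple gradient-style update rule are my own assumptions, not Hohwy's.
def minimize_prediction_error(actual_input, initial_guess, rate=0.1, steps=100):
    """Iteratively revise a hypothesis to reduce prediction error."""
    hypothesis = initial_guess
    for _ in range(steps):
        predicted_input = hypothesis          # simplest generative model: identity
        error = actual_input - predicted_input
        hypothesis += rate * error            # revise the model toward the data
    return hypothesis

estimate = minimize_prediction_error(actual_input=5.0, initial_guess=0.0)
```

After enough iterations the hypothesis converges on the input, i.e., the prediction error is minimized.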

Hohwy isn't the first to develop the prediction error minimization framework, but, as far as I know, he's the first to write a monograph developing it. Key aspects of the framework emerged in machine learning research by Geoff Hinton and colleagues from the 1980s onward. Using prediction error minimization-style processing, these researchers have achieved promising results for perceptual recognition by computer networks (Le 2013). The results suggest that the mechanism can be implemented by actual neural systems, including human brains. The framework has also been developed in biophysics and computational neuroscience by Karl Friston and collaborators. Friston argues for prediction error minimization on the basis of principles in statistical physics. The relevant principles are alleged to explain how biological systems remain organized and resist the thermodynamic tendency toward disorder (entropy). Hohwy acknowledges this literature's influence on his thinking. But the literature is quite technical, and some of it is highly controversial. Hohwy doesn't cover it in detail. Instead, he aims to make the framework more accessible, developing and applying it in novel ways for a philosophical audience.

Hohwy's book has three parts. In part 1, he introduces the framework and its explanation of perception and action. In parts 2 and 3, he applies the framework to many further topics, including perceptual binding, delusion, autism, cognitive penetration, emotion, and the unity of consciousness.

This review is organized as follows. Section 1 explains how the prediction error minimization mechanism applies to perceptual processing. Section 2 explains the hierarchical mental architecture in which the mechanism operates on Hohwy's view. Section 3 addresses Hohwy's proposed explanation of action. Section 4 covers two further applied topics: autism and cognitive penetration. Section 5 concludes. Along the way, I raise questions about the framework's prospects as a grand, unified theory of how the brain works.

1. Perception and the core mechanism

Hohwy introduces the prediction error minimization (PEM) mechanism in an account of perceptual processing. Suppose you're holding a parakeet. You feel its talons. You see its colors. You hear its chirp. How do you arrive at an accurate perceptual experience of the bird? First, your perceptual system receives inputs to different sensory surfaces: pressures on the hand, photons on the retinas, etc. The inputs are converted into sensory signals that travel into the brain along neural pathways. But the brain still has problems to solve before generating accurate experiences. Let's focus on two problems: underdetermination and binding. We can use them to illustrate what's attractive and unique about PEM.

The first problem concerns underdetermination (Ch 1). In vision, images on the retinas could be caused by indefinitely many environmental layouts. For example, the parakeet images could result from a photo, a hologram, or birds of various shapes, sizes, and distances from the observer, among other things. Perception scientists want a tractable, reliable mechanism that explains how the brain figures out the distal causes of the input.

Like others working in the PEM framework, Hohwy holds that the brain's solution approximates inferences according to Bayes' theorem (pp. 41-46). Prior to receiving some sensory signal, the perceptual system assigns probabilities to the world's being various ways (Bayesian priors). The brain also assigns probabilities concerning which sensory signal will be received given the world's being those ways (Bayesian likelihoods). Using these probability assignments, the brain infers which objects and properties in the environment likely caused the sensory signals. It then represents these objects and properties in perceptual experiences.
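The inference step can be illustrated with a small Python sketch of Bayes' theorem over discrete hypotheses (the hypotheses and numbers below are my own toy assumptions, not examples from the book):

```python
# Illustrative Bayesian inference over discrete hypotheses.
def posterior(priors, likelihoods):
    """Compute P(hypothesis | signal) from priors and likelihoods via Bayes' theorem."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Made-up priors over candidate distal causes of the parakeet-like input:
priors = {"real bird": 0.7, "photo": 0.2, "hologram": 0.1}
# Made-up probability of the combined tactile/visual/auditory signal
# under each hypothesis:
likelihoods = {"real bird": 0.9, "photo": 0.1, "hologram": 0.05}
post = posterior(priors, likelihoods)
```

With these numbers, the "real bird" hypothesis dominates the posterior, and the perceptual system would represent a real bird as the cause of the signal.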

At least two things set apart Hohwy's development of the PEM framework from other Bayesian proposals. First, many Bayesian proposals have narrow explanatory aims; they cover only small sets of perceptual processes, such as face recognition or motion detection. By contrast, the PEM framework aims to explain all mental processes. Second, many Bayesian proposals don't specify mechanisms to carry out the relevant inferences; they say only that such mechanisms exist. By contrast, proposing a tractable mechanism is PEM's central contribution, though it remains controversial.

The second problem in perceptual processing concerns binding (Ch 5). Binding can occur within a sensory modality, e.g., when colors, textures, and shapes are represented as properties of one visible parakeet. Binding can also occur across modalities, e.g., when colors and sounds are attributed to one bird by vision and hearing, respectively. But how does the brain bind these features into coherent experiences?

On a "bottom-up" approach, which Hohwy opposes, binding occurs in separate processes following feature detection. The perceptual system first extracts information about the features of objects, such as colors and shapes, from the sensory signals. It then binds these features as properties of one object.

The PEM framework turns the above approach on its head. According to PEM, the brain first constructs coherent hypotheses that bind relevant features. Then it tests the hypotheses against the sensory signals. In the bird example, your brain hypothesizes that you're holding a bird with certain talons, colors, and chirps. Then it tests those hypotheses by comparing predictions about the sensory signals with the actual signals. PEM proposes an elegant solution to the binding problem, because it requires only one mechanism for binding and feature detection.

2. Hierarchy and precision expectation

According to the framework, the prediction error minimization mechanism is made tractable by implementing it across hierarchical levels of processing. At each level of the hierarchy, states predict features of the sensory signal at the level immediately below. When the brain correctly anticipates aspects of the signal, those aspects are "explained away". When it doesn't, prediction error occurs. These prediction errors are what remains of the sensory signal as it's sent up through the neural pathways of the hierarchy; they constitute what's left for higher levels to explain away.
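A toy two-level version of this error-passing scheme, with made-up numbers and names of my own (not the book's), might look like:

```python
# Two-level toy hierarchy: each level predicts the signal below it and
# passes the unexplained residue (prediction error) upward.
def run_hierarchy(signal, level1_prediction, level2_prediction):
    error1 = signal - level1_prediction        # residue left for level 2
    error2 = error1 - level2_prediction        # residue left unexplained
    return error1, error2

# Level 1 anticipates most of the fast-changing detail; level 2 mops up
# the slower regularity that level 1 failed to anticipate.
e1, e2 = run_hierarchy(signal=10.0, level1_prediction=8.0, level2_prediction=2.0)
```

Here level 1 explains away most of the signal, level 2 explains away the rest, and no prediction error remains.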

Exactly which contents are represented by hypotheses at each level is never made clear in the book. And Hohwy doesn't work through a detailed, realistic example of processing for an experience. However, he does describe the hierarchical ordering in two schematic ways. One concerns computational distance from the sensory surfaces (p. 32). Another concerns spatiotemporal scales of causal regularities represented at each level (p. 30). We can paraphrase the orderings thus:

A. Level Ln is higher than Lm iff computations performed at Ln are farther from the sensory surfaces in the computational sequence than computations performed at Lm.

B. Level Ln is higher than Lm iff the contents expressed by hypotheses at Ln concern larger spatiotemporal scales than those expressed by hypotheses at Lm.

The ordering in A follows from core claims about the mechanism. States at Ln predict the sensory signal at Lm. Lm passes prediction error up to Ln. So Ln is farther from the sensory surfaces than Lm. Hohwy motivates B as follows: the deep causal structure of the world is ordered by spatiotemporal regularities -- patterns over different lengths of time and regions in space -- and the brain recapitulates that structure (pp. 27-28).

To illustrate: suppose you see a bouncing ball. Hohwy argues that states at lower levels hypothesize small, short-term regularities in, for example, the shadows on the ball and precise location of segments of the ball's edges. Low levels of the model predict and test the sensory signal with respect to these quickly changing patterns. States at higher levels hypothesize the persistence of the ball through the bounces, predicting and testing the sensory signal for properties instantiated at larger spatiotemporal scales.

As the sensory signals move up the hierarchy, noise often complicates the process. Noise involves irregularities and failures of correspondence between the sensory signal and target properties in the world. Noise arises from at least two sources. Externally, it arises from poor environmental conditions, as when you're viewing objects in fog. Internally, noise arises from variability in neural activation, as when neurons fire differently on two occasions, despite the same relevant initial conditions.

According to Hohwy, the brain accounts for noise using "expected precisions". In addition to predicting the sensory signal, the brain estimates the signal's precision. Precision is inversely proportional to noise, so the brain in effect predicts how noisy the signal is at each level of the hierarchy. To generate accurate perceptual experiences when the brain expects a noisy signal, the brain doesn't treat the signal as confirming or disconfirming its prior hypotheses as strongly as it would for a precise signal.
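This idea can be sketched as a standard precision-weighted average, familiar from Gaussian Bayesian updating. The code and numbers below are my own illustrative assumptions, not formulas from the book:

```python
# Sketch of precision-weighted belief updating (assumed Gaussian form).
def update(prior_mean, prior_precision, signal, signal_precision):
    """Weight the signal against the prior by their expected precisions."""
    total = prior_precision + signal_precision
    mean = (prior_precision * prior_mean + signal_precision * signal) / total
    return mean, total

# A precise signal dominates the prior...
m_clear, _ = update(prior_mean=0.0, prior_precision=1.0,
                    signal=10.0, signal_precision=9.0)
# ...while a noisy (low-precision) signal barely shifts it.
m_noisy, _ = update(prior_mean=0.0, prior_precision=1.0,
                    signal=10.0, signal_precision=0.1)
```

When the brain expects a noisy signal (low precision), the posterior stays close to the prior hypothesis; when it expects a precise signal, the signal strongly confirms or disconfirms the hypothesis.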

I should now explain a worry I have about Hohwy's proposed hierarchical orderings. It seems to me that A and B cannot both be true, given PEM's commitments. On PEM, a person's beliefs are in the hierarchy (p. 126). Person-level beliefs can be computationally far from the sensory surfaces and can also represent things at small spatiotemporal scales. This suggests the orderings may conflict.

For example, suppose Sam believes that a photon will escape a micro-cavity in a nanosecond due to environmental fluctuations. Later, Sam visually experiences a ball bouncing. The experience concerns regularities at larger spatiotemporal scales than the belief: balls are larger than photons; bounces are longer than a nanosecond. So, by B, Sam's belief is lower in the hierarchy than the visual experience. But the belief is sophisticated and farther from the sensory surfaces than the experience. So, by A, Sam's belief is higher than the experience. So, A and B conflict. Claim A follows from the core PEM framework. So, the ordering proposed in B is false.

To account for my worry, one could relax the ordering, allowing for counter-instances. However, the problem generalizes for numerous beliefs, so relaxing may not help. Alternatively, one could exclude beliefs from the hierarchy. However, this would seriously undermine Hohwy's ambition to explain all mental processes with one mechanism. Finally, one could deny that higher-level hypotheses represent contents at larger spatiotemporal scales, while retaining the idea that higher-level states are sensitive to causal regularities at larger spatiotemporal scales. This approach has some support in the machine learning literature on perceptual processing, but it has yet to be worked out for PEM as a unified theory of all mental states and processes.

3. Action

So far we've seen one way to minimize prediction error: by revising hypotheses and generating new predictions. Hohwy calls this "perceptual inference". There's another way to minimize prediction error, which Hohwy calls "active inference". In active inference, the brain alters the sensory inputs it receives by initiating bodily action in hopes of yielding sensory signals that conform to its predictions.

To illustrate: suppose an agent wants to raise her arm. Her brain predicts sensory signals associated with a raised arm. If her arm is in fact lowered, prediction error occurs. But if the agent raises her arm and the predicted sensory signals are received, prediction error is minimized.

The claim that perception and action rely on the same mechanism is an attractive unifying feature of the PEM framework. But Hohwy acknowledges a puzzle here. If one mechanism causes both perceptual and active inference, it's unclear what triggers one rather than the other. We need to explain why the brain sometimes minimizes prediction error through revising its predictions and other times through action.

Hohwy offers a tentative solution to the puzzle: expecting an especially precise signal indicating a desired state can initiate action to bring that state about. Hohwy writes,

action ensues if the counterfactual proprioceptive input is expected to be more precise than actual proprioceptive input, that is, if the precision weighted gain is turned down on the actual input. This attenuates the current state and throws the system into active inference. (p. 83)

Here's a gloss on Hohwy's proposal. The brain forms an expectation of what the proprioceptive signal would be if the arm were raised. The brain predicts that the signal would be precise, hence a good guide to the world. Meanwhile, the brain reduces its expectation for precision in the actual signal from the lowered arm, treating it as noise. The relatively high, expected precision regarding the counterfactual state initiates action to bring about that state.

If that's the proposal, however, I don't see how it solves the puzzle. First, I don't see how expectations concerning a counterfactual state launch the arm into action. If the predicted signal is treated as counterfactual, the prediction doesn't take a stand on the arm's actual position. An actual signal and a prediction about a counterfactual signal cannot conflict. So the proposal doesn't identify any prediction error to minimize.

Second, suppose the proposal is modified: the action-inducing prediction is that the arm is actually raised. Here there may be prediction error, unless the arm is raised. Still, this doesn't explain why action occurs, rather than hypothesis revision. For all the proposal says, the prediction could be of a precise, raised-arm signal that never materializes because the arm is never raised. In that case, the raised-arm prediction could be revised without action.

There's a larger worry here. Hohwy claims that the PEM mechanism explains all mental processes. The puzzle about what initiates action rather than prediction revision illustrates the potential explanatory limitations of the mechanism. If some further mechanism is required to solve the puzzle, PEM's explanatory ambitions are undermined.

4. Applications

I now turn to Hohwy's applications of the framework in parts 2 and 3. Hohwy covers many topics, which allows him to develop a novel argument for the prediction error minimization framework by appealing to its explanatory scope. The book's central argument is that prediction error minimization can explain all mental phenomena at least as well as alternative accounts and with more unification than any other theory. Space constraints limit me to two applications -- autism and cognitive penetration -- each of which connects with larger issues.

4.1 Autism

Individuals with Autism Spectrum Disorder (ASD) exhibit perceptual symptoms (e.g., sensitivity to lights) and social-cognitive symptoms (e.g., difficulty identifying others' mental states). Hohwy argues that perceivers with ASD expect precise sensory signals and little noise in perceptual processing. In generating perceptual experiences, they heavily weight bottom-up sensory signals, giving relatively less weight to prior expectations than perceivers without ASD. Hohwy adds, "An individual who expects more precision than most others will be more guided by particular sensory input and less tempted to generalize that input under previously learned more general regularities" (p. 162). Since social cognition relies on inferences from perceptual cues, and individuals with ASD don't generalize from particular perceptual representations as effectively as others, they also show social deficiencies.

Hohwy's proposed explanation for ASD has a promising structure. The explanation suggests that perceptual and social-cognitive processing in individuals with ASD is driven more by bottom-up information and less by top-down influence from expectations than for others. However, I wonder how much support the application provides for the PEM mechanism. Other explanations of ASD share the structure of Hohwy's explanation but don't appeal to the PEM mechanism (e.g., Pellicano and Burr 2012). Hohwy discusses a related proposal by Qian and Lipkin (2011) in terms of neuronal tuning functions, which describe how populations of neurons respond to particular stimuli. Using terms that may be unfamiliar to some readers, Hohwy writes,

Their main hypothesis is that typically developed brains are biased towards generalist, interpolation learning (broad tuning curves) and that the brain of an autistic individual is biased towards particularist, lookup table learning (narrow tuning curves). I think this difference in learning bias would be nicely understood in terms of a difference in precision optimization (p. 162).

Abstracting from the details and technical terms, Hohwy suggests that Qian and Lipkin's explanation of ASD fits well with the prediction error minimization framework. However, their explanation doesn't require PEM. My worry is this: although Qian and Lipkin's account can be paraphrased in terms of the PEM framework, it needn't be. Likewise, the essential features of Hohwy's explanation of ASD can be put in terms of prediction error minimization, but needn't be. The crucial elements of the explanation appeal to differences in bottom-up and top-down processing, which are neutral on whether the prediction error minimization mechanism is involved. The application of the PEM mechanism fills in various details, but doesn't seem to do crucial work in Hohwy's explanation of ASD.

Perhaps PEM can claim explanatory advantages with its unified account of perceptual and social-cognitive deficits in ASD, because the PEM framework spans these different mental processes. However, more would need to be said along these lines than Hohwy says, since the claim that perceptual deficits may be linked to these other deficits isn't unique to PEM.

In general, we should distinguish evidence for explanations presented in PEM terminology from evidence for explanations that require the prediction error minimization mechanism. The explanatory argument for PEM could be strengthened by showing that PEM does crucial work in explaining the wide range of phenomena Hohwy discusses, not only that such explanations are compatible with PEM.

4.2 Cognitive penetration

Hohwy's development of the PEM framework entails that there's no theoretical or anatomical boundary preventing cognitive states from influencing perceptual processing (p. 122). The hierarchy seamlessly spans perception and cognition. This feature distinguishes Hohwy's view from many others in cognitive science.

Cognitive penetrability and impenetrability are of central interest here. Hohwy works with a definition of cognitive penetration similar to others in the literature. Roughly, cognitive penetration occurs when cognitive states influence perceptual processing in some way other than merely shifting where the subject attends in the perceptual field or altering the sensory inputs the subject receives.

Hohwy argues that cognitive penetration tends to occur when uncertainty increases at lower levels of perceptual processing. In that case, prediction error minimization relies more on prior expectations (including person-level beliefs) and treats the sensory signal as more noisy.

Cognitive impenetrability provides test cases for the PEM framework as Hohwy develops it. If there's no boundary to influence by cognitive states, why are some perceptual representations recalcitrant despite knowledge to the contrary? For example, why do the Müller-Lyer lines appear to have different lengths even when we know they don't?

Hohwy offers two lines of response. On one, he cites the limited information the perceptual system receives when viewing the Müller-Lyer lines. The stimulus is impoverished: there are only lines on a white surface. So the perceptual system can't gather useful information about the scene by changing vantage point. Hohwy hypothesizes that subjects viewing the lines try to override the role of low-level prior hypotheses by shifting attention, but can't, because the stimulus conditions are too sparse (p. 127). Here Hohwy highlights the role that could be played by moving around to get a different view, if the stimulus contained richer information. Still, without saying more, it's unclear why moving around would be the only way to override the initial perceptual processing in such cases.

Hohwy's more promising suggestion is that the perceptual experience is generated using only low-level hypotheses, because there's relatively little noise in the signal. Although the stimulus conditions are sparse, the lines are clear. There's little external noise. Hohwy argues that, as a result, little prediction error is passed on to cognitive levels of the hierarchy, because the signal is fully explained away at low levels. Without much sensory signal passed to cognitive levels, cognitive states can't have much influence on the resulting experience (pp. 125-126).

Hohwy's proposal is intriguing, but it seems to me that the appeal to uncertainty and noise doesn't explain every recalcitrant illusion. In the footsteps motion illusion (Anstis 2003), a dark object traverses a background of vertical black and white stripes. When the object crosses a black stripe, it's at low contrast with the background and appears to slow down. When it crosses a white stripe, it's at high contrast with the background and appears to speed up. In fact, its speed is constant. Weiss, Simoncelli, and Adelson (2002) proposed a Bayesian model for related effects. According to the proposal, the brain has a prior expectation for slower movement, because objects in general tend to be at rest or moving slowly. In such illusions, when the object is at low contrast the signal is noisy, so the system relies more on prior expectations of slow movement. On Hohwy's explanation, we might expect the subject to be able to override the footsteps illusion through cognitive penetration, since the illusion is due to noise and uncertainty. However, the footsteps illusion is recalcitrant. This suggests that more than an appeal to uncertainty and noise may be needed to explain impenetrability in recalcitrant illusions. More generally, it suggests that Hohwy's proposal doesn't fully account for cognitive impenetrability within the PEM framework.
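The contrast-dependence at issue can be caricatured as a precision-weighted blend of a "slow" prior and the measured speed. This is only a toy gloss on the slow-motion-prior idea in Weiss et al. (2002); the mapping from contrast to precision and all the numbers are my own assumptions:

```python
# Toy gloss on the slow-motion prior: perceived speed is a precision-weighted
# blend of a prior for slowness and the measured speed. The contrast-to-precision
# mapping and all numbers are illustrative assumptions.
def perceived_speed(measured, contrast, prior_speed=0.0, prior_precision=1.0):
    signal_precision = contrast * 10.0        # higher contrast -> less noisy signal
    total = prior_precision + signal_precision
    return (prior_precision * prior_speed + signal_precision * measured) / total

fast_looking = perceived_speed(measured=2.0, contrast=0.9)  # white stripe: high contrast
slow_looking = perceived_speed(measured=2.0, contrast=0.1)  # black stripe: low contrast
```

On the low-contrast stripe, the noisy signal is discounted and the slow prior pulls the estimate down, so the same physical speed looks slower.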

5. Conclusion

I've presented core aspects of prediction error minimization and two applications of the framework as Hohwy develops it. I've also raised several objections. I close by emphasizing agreement. Hohwy approaches his elaboration and defense of the PEM framework in an exploratory spirit (p. 7). He offers his proposals not as definitive, but as promising ways forward for an ambitious new theory of how the brain works. I've offered my criticisms in the same spirit. I'm interested to see where PEM goes next. I predict that Hohwy's book will be an important part of the discussion.

ACKNOWLEDGEMENTS

For helpful discussions, I thank David Bennett, Ned Block, Zoe Jenkin, Alex Kiefer, Susanna Siegel, and Jakob Hohwy.

REFERENCES

Anstis, S. (2003). Moving objects appear to slow down at low contrasts. Neural Networks, 16(5), 933-938.

Le, Q. V. (2013). Building high-level features using large scale unsupervised learning. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 8595-8598). IEEE.

Pellicano, E., and Burr, D. (2012). When the world becomes 'too real': a Bayesian explanation of autistic perception. Trends in Cognitive Sciences, 16(10), 504-510.

Qian, N., and Lipkin, R. M. (2011). A learning-style theory for understanding autistic behaviors. Frontiers in Human Neuroscience, 5.

Weiss, Y., Simoncelli, E. P., and Adelson, E. H. (2002). Motion illusions as optimal percepts. Nature Neuroscience, 5(6), 598-604.