Significance To survive in changing environments animals must use sensory information to form accurate representations of the world. Surprising sensory information might signal that our current beliefs about the world are inaccurate, motivating a belief update. Here, we investigate the neuroanatomical and neurochemical mechanisms underlying the brain’s ability to update beliefs following informative sensory cues. Using multimodal brain imaging in healthy human participants, we demonstrate that dopamine is strongly related to neural signals encoding belief updates, and that belief updating itself is closely related to the expression of individual differences in paranoid ideation. Our results shed new light on the role of dopamine in making inferences and are relevant for understanding psychotic disorders such as schizophrenia, where dopamine function is disrupted.

Abstract Distinguishing between meaningful and meaningless sensory information is fundamental to forming accurate representations of the world. Dopamine is thought to play a central role in processing the meaningful information content of observations, which motivates an agent to update their beliefs about the environment. However, direct evidence for dopamine’s role in human belief updating is lacking. We addressed this question in healthy volunteers who performed a model-based fMRI task designed to separate the neural processing of meaningful and meaningless sensory information. We modeled participant behavior using a normative Bayesian observer model and used the magnitude of the model-derived belief update following an observation to quantify its meaningful information content. We also acquired PET imaging measures of dopamine function in the same subjects. We show that the magnitude of belief updates about task structure (meaningful information), but not pure sensory surprise (meaningless information), is encoded in midbrain and ventral striatum activity. Using PET we show that the neural encoding of meaningful information is negatively related to dopamine-2/3 receptor availability in the midbrain and dexamphetamine-induced dopamine release capacity in the striatum. Trial-by-trial analysis of task performance indicated that subclinical paranoid ideation is negatively related to behavioral sensitivity to observations carrying meaningful information about the task structure. The findings provide direct evidence implicating dopamine in model-based belief updating in humans and have implications for understanding the pathophysiology of psychotic disorders where dopamine function is disrupted.

To successfully navigate the world we need to exploit sensory information to make inferences about the environment (1). For example, before crossing the road it is sensible to check the traffic lights at a pedestrian crossing to decide whether it is safe to cross or not, drawing on our cognitive model of what traffic lights (the observable information) tell us about the traffic flow (the partially observable, or hidden, environmental state). When the light changes from the “red man” to the “green man” this should cause us to update our belief about the state of the environment to infer it is now safe to cross. Importantly, however, it is also critical to assess the informativeness of any sensory input. For example, although it would be surprising to see both the green and red lights on simultaneously, it is not advisable to update one’s beliefs about traffic flow based on this observation alone. Thus, adaptive behavior depends on an ability to discriminate between observations carrying relevant information for the task at hand (informative or meaningful cues) and observations carrying irrelevant, ambiguous, or no information (noninformative or meaningless cues). The former should induce updates in an agent’s model of the world, whereas the latter should not (2).

Dopamine may play a key role in the processing of meaningful sensory information. Phasic activity in midbrain dopamine neurons is implicated in processing unexpected and salient environmental stimuli (3), including those that are novel (4–6) and associated with reward (7, 8). More recent evidence suggests a role for dopamine in updating a rich internal model of the task environment, necessary for flexible behavior (9–11). Specifically, phasic midbrain dopamine signals can reflect inferences about the identity of hidden task states (12, 13) and encode value-neutral prediction errors (14, 15), as well as support stimulus–stimulus associative learning (10). Here, we test whether dopamine is associated with the processing of meaningful sensory information in humans, so as to allow an agent to make inferences on a sensory input and appropriately update their internal representations of the environment.

Meaningful information can be formally quantified as the degree to which a new observation changes an agent’s prior belief about the current state of the world, given previous observations, to a new (posterior) belief. The magnitude of this belief update from a ‘prior’ belief to a ‘posterior’ belief is usually quantified as the Kullback–Leibler divergence (D_KL) and has been termed “Bayesian surprise” (SI Appendix, Eq. S8) (2, 16).

Belief updates occur after unexpected observations, but unexpectedness alone should be insufficient to motivate change in an agent’s internal representations. As outlined in our example above, unexpected observations that are equally unlikely under all competing hypotheses about the environment contain no meaningful information with respect to the hidden state. The improbability of an observation, given an agent’s prior expectation, is often quantified in terms of information-theoretic surprise (I_S, or “surprisal”), which can be thought of as “counter evidence” to an agent’s representation of the world (SI Appendix, Eq. S9).
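For orientation, the two quantities can be written in standard notation as follows. This is a sketch rather than a reproduction of SI Appendix, Eqs. S8 and S9: the symbols s (hidden state) and o_t (observation on trial t), and the direction of the divergence (posterior relative to prior), are assumptions made here for illustration.

```latex
\begin{align}
D_{\mathrm{KL}} &= \sum_{s} p(s \mid o_{1:t}) \, \log \frac{p(s \mid o_{1:t})}{p(s \mid o_{1:t-1})} \\
I_{S} &= -\log p(o_{t} \mid o_{1:t-1})
\end{align}
```

Under these definitions, an observation whose likelihood p(o_t | s) is identical for every state s leaves the posterior equal to the prior, so D_KL is zero even when I_S is large.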

The distinction between the pure unexpectedness (information-theoretic surprise) of an observation and its meaningful information content (Bayesian surprise) is central to understanding how new information influences adaptive behavior and may also be of relevance for understanding psychotic symptoms in schizophrenia. One theoretical formulation postulates that stimulus-locked dopamine neural activity is important for processing salient stimuli, and that maladaptive dopaminergic activity in response to ambiguous, unreliable, or behaviorally irrelevant (meaningless) events leads to aberrant attribution of salience to these same events. This in turn is thought to underpin misattributional symptoms such as paranoia (17–23). Of note, the detection of behaviorally salient stimuli involves a number of brain circuits that modulate the firing of dopamine neurons in the midbrain. In particular, the anterior hippocampus has a key role in regulating midbrain dopamine neuron activity depending on the novelty and context of stimuli via a circuit that involves the nucleus accumbens and ventral pallidum (5, 20, 24).

An understanding of the mechanisms underlying belief updating is therefore critical for understanding both the generation of complex goal-directed behaviors and symptoms of certain neuropsychiatric disorders. Recent fMRI studies have begun to investigate the neural correlates of belief updating in humans, showing that encoding of unsigned belief updates (but not simple unexpectedness) is present in dopamine-rich midbrain regions, specifically the ventral tegmental area (VTA) and substantia nigra (SN) (25–27). However, to date, there is no evidence linking direct measures of dopamine function to belief updating in humans.

We investigated a dopaminergic basis for belief updating using a model-based fMRI task, combined with PET imaging of dopamine function. We used a task that separates Bayesian surprise, information-theoretic surprise, and reward prediction errors on a trial-by-trial basis (Fig. 1) (27). In brief, during the task participants (n = 39) needed to track which of two (hidden) task states pertained at every trial, based on imperfectly informative observations about state identity. Specifically, they were tasked to infer whether visual or auditory cues were currently relevant for predicting monetary outcomes, where the relevant modality signaled the sign of the monetary outcome with ∼90% cue validity. The identity of the relevant modality reversed (switched) periodically. Participants were not explicitly informed of the validity of the relevant cue or the reversal probability but were thoroughly trained on the task before scanning.

Fig. 1. fMRI task showing two example trials, one informative and one noninformative. The task contained two auditory and two visual cues, with one cue from each modality being predictive of a monetary win and the other of a monetary loss (∼90% validity). Trials started with the simultaneous presentation of one visual and one auditory cue, followed by a monetary outcome (gains or losses from 10 to 30 pence). For any given trial only one cue modality was relevant for predicting the outcome, and the identity of the relevant cue switched five to six times in a session of 60 trials. The goal of the task was to correctly track the identity of the relevant cue modality (i.e., the hidden task state) at each trial, using information from cue-outcome observations. At the end of each trial participants reported their belief about the identity of the relevant modality using a rating scale. Half of the trials were noninformative, in that the visual and auditory cues predicted the same (congruent) monetary outcome, while the other half were informative, in that auditory and visual cues predicted incongruent outcomes. Unexpected outcomes in both informative and noninformative trials had positive information-theoretic surprise (I_S), but these events were only associated with positive Bayesian surprise (D_KL) in informative trials.
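The trial structure described in the caption can be sketched as a small generator. This is a hypothetical illustration, not the task code used in the study: the function name `simulate_session`, the uniform placement of switch points, the random assignment of cue predictions, and the omission of monetary amounts are all assumptions.

```python
import random


def simulate_session(n_trials=60, n_switches=5, validity=0.9, seed=0):
    """Generate one hypothetical session of cue-outcome trials."""
    rng = random.Random(seed)
    # Assumption: switch points are drawn uniformly, without replacement.
    switch_trials = set(rng.sample(range(1, n_trials), n_switches))
    # Half of the trials are informative (incongruent cue predictions).
    informative = [i < n_trials // 2 for i in range(n_trials)]
    rng.shuffle(informative)
    relevant = rng.choice(["visual", "auditory"])  # hidden task state
    trials = []
    for t in range(n_trials):
        if t in switch_trials:
            relevant = "auditory" if relevant == "visual" else "visual"
        visual_win = rng.random() < 0.5
        # Informative trials: the two cues predict opposite outcomes.
        auditory_win = (not visual_win) if informative[t] else visual_win
        predicted_win = visual_win if relevant == "visual" else auditory_win
        # The relevant cue signals the outcome sign with ~90% validity.
        outcome_win = predicted_win if rng.random() < validity else not predicted_win
        trials.append({
            "relevant": relevant,
            "informative": informative[t],
            "visual_predicts_win": visual_win,
            "auditory_predicts_win": auditory_win,
            "outcome_win": outcome_win,
        })
    return trials
```

Calling `simulate_session(seed=1)` returns 60 trial dictionaries, of which 30 are informative, with five reversals of the relevant modality.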

At the start of each trial, two cues (one auditory, one visual) were presented simultaneously and could either be incongruent or congruent in their monetary predictions. Following cue presentation participants observed a monetary outcome (either a win or a loss) and subsequently indicated their belief about the relevant predictive modality (current environmental state) on a rating bar (Fig. 1). Monetary outcomes that were unexpected under a current prior hypothesis (rendering I_S > 0) could provide either meaningful (D_KL > 0, in incongruent trials) or meaningless (D_KL = 0, in congruent trials) information regarding the identity of the task-relevant modality. This design allows a decorrelation of Bayesian (D_KL) and information-theoretic (I_S) surprise (27, 28), enabling us to identify the neural signature of each construct. We hypothesized that belief updates (correlating with the meaningful information content of an observation), but not sensory unexpectedness, would be encoded in dopamine-rich brain areas, namely the SN/VTA complex and ventral striatum, in line with predictions from previous findings (25–27). Moreover, we tested whether deviations from optimal behavior in this task were related to the presence of subclinical paranoid thoughts, a key prediction of the aberrant salience hypothesis of schizophrenia.
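The decorrelation of the two surprise measures can be illustrated with a minimal two-state Bayesian update. This is a simplified sketch, not the observer model fitted in the study: it assumes a fixed cue validity of 0.9, ignores the reversal probability between trials, and the names `likelihood` and `update` are hypothetical.

```python
import math

CUE_VALIDITY = 0.9  # approximate validity stated in the text


def likelihood(outcome_win, cue_predicts_win, validity=CUE_VALIDITY):
    """P(outcome | this cue belongs to the relevant modality)."""
    return validity if outcome_win == cue_predicts_win else 1.0 - validity


def update(prior_visual, visual_predicts_win, auditory_predicts_win, outcome_win):
    """Return (posterior_visual, surprisal, bayesian_surprise) for one trial."""
    lik_visual = likelihood(outcome_win, visual_predicts_win)
    lik_auditory = likelihood(outcome_win, auditory_predicts_win)
    prior = [prior_visual, 1.0 - prior_visual]
    # Marginal probability of the observed outcome under the prior.
    evidence = prior[0] * lik_visual + prior[1] * lik_auditory
    posterior = [prior[0] * lik_visual / evidence,
                 prior[1] * lik_auditory / evidence]
    surprisal = -math.log(evidence)  # information-theoretic surprise
    # Bayesian surprise: KL divergence of the posterior from the prior.
    d_kl = sum(q * math.log(q / p) for q, p in zip(posterior, prior) if q > 0)
    return posterior[0], surprisal, d_kl


# Congruent trial: both cues predict a win, but a loss is observed.
congruent = update(0.8, True, True, False)
# Incongruent trial: visual predicts a win, auditory a loss; a loss is observed.
incongruent = update(0.8, True, False, False)
```

In this sketch both trials carry positive surprisal, but only the incongruent trial shifts the belief about the relevant modality; in the congruent trial the outcome is equally unlikely under both states, so the posterior equals the prior and the Bayesian surprise is zero up to floating-point error.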

To test directly the role of dopamine in these processes, we used PET with the dopamine-2/3 receptor (D2/3R) agonist ligand [11C]-(+)-4-propyl-9-hydroxy-naphthoxazine ([11C]-(+)-PHNO) at baseline (n = 36) and following 0.5 mg/kg dexamphetamine challenge (n = 17). The baseline [11C]-(+)-PHNO PET scan measures the availability of D2/3 autoreceptors in the midbrain, which are inhibitory (29–31). We hypothesized that greater midbrain D2/3R availability, reflecting greater tonic inhibitory tone, would be negatively related to phasic midbrain neural response during belief updates (4). Following acute amphetamine challenge there is an increase in dopamine concentration in the striatum, consequent upon blockade of dopamine reuptake (4, 32), and also possibly due to increased dopamine neuron firing (33–35). Greater dexamphetamine-induced dopamine release is thought to be associated with more spontaneous dopamine transients in the drug-free state, indicating a lower signal-to-noise ratio in dopaminergic signaling (17). Consequently, we hypothesized that greater striatal dopamine release capacity would be associated with lower ventral striatal neural response during belief updates. Finally, by measuring the D2/3R availability in the striatum at baseline, we were able to test a hypothesized inverted-U relationship between cognitive flexibility and striatal dopamine function at rest (36).

Discussion Controlling for the effects of signed reward prediction errors, we show that the SN/VTA and ventral striatum encode meaningful information content in sensory observations. This encoding reflected solely the magnitude of belief updates regarding the current environmental state (Bayesian surprise from prior beliefs to posterior beliefs), but not the simple unexpectedness of an observation (information-theoretic surprise). Using in vivo PET imaging of dopamine we also demonstrate that neural activity encoding belief updates is negatively related to D2/3R availability in the midbrain, and dopamine release capacity in the striatum. These results provide a direct link between belief updating and dopaminergic function, extending observations from previous fMRI studies that implicate SN/VTA in encoding the magnitude of belief update signals on the one hand (25–27) and the assumed role of dopamine in an implementation of probabilistic inference on the other (41, 42). Additionally, we show that participants’ trial-by-trial sensitivity to the meaningful information content of observations has an inverted-U relationship with striatal baseline D2/3R availability, in line with evidence that striatal D2/3R signaling has an inverted-U relationship with cognitive flexibility (36). Our results therefore shed light on the neurochemical basis of belief updating in humans using in vivo quantification of dopamine function. The [11C]-(+)-PHNO signal in the SN/VTA primarily indexes D3 autoreceptor availability (30, 43, 44) and the signal here is less sensitive to tonic synaptic dopamine levels compared with the striatum (45). D2/3R availability was negatively related to neural activity encoding belief updates in the SN/VTA complex, consistent with evidence that midbrain D3Rs have an inhibitory effect on dopaminergic neurons (29, 31), and in line with the notion that tonic dopamine signaling may regulate the amplitude of stimulus-locked phasic dopamine neuron activity (4).
For example, D3R knockout mice have elevated extracellular dopamine levels in the nucleus accumbens (46), while mice treated with D3R-preferring agonists show reduced dopamine concentration in the accumbens (47). In a recent fMRI study, selective antagonism of the D3R enhanced midbrain and ventral striatal fMRI activation during anticipation of monetary reward, providing indirect evidence for an inhibitory role for midbrain D3Rs in humans (48). The behavioral significance of elevated midbrain D2/3R availability has also recently been investigated in rats, where nigral [11C]-(+)-PHNO BP_ND correlated with impaired reversal learning in a probabilistic reward task (49). Our findings extend this work by showing that natural variation in human midbrain D2/3R availability is associated with altered midbrain activation during belief updating, with lower levels associated with relatively greater activation. Moreover, our task design allowed us to investigate the specific role of dopamine in encoding the meaningful information content of an observation, decorrelating this construct from simple unexpectedness and reward prediction error. We found that a belief update signal in the ventral striatum was negatively correlated with dexamphetamine-induced striatal dopamine release capacity, providing in vivo human evidence that this signal is related to dopamine function. This complements findings from a recent optogenetic fMRI study in rats, which demonstrated that striatal blood-oxygen-level-dependent (BOLD) activations may be driven by mesolimbic dopamine neuron firing (50). It has been proposed that greater amphetamine-induced dopamine release capacity in vivo corresponds to a greater tendency toward spontaneous dopamine neuron firing in the drug-free (baseline) state, which decreases the signal-to-noise ratio of stimulus-locked dopamine bursts (17).
Our finding that striatal dopamine release capacity is negatively correlated with the striatal BOLD response encoding belief updates is therefore consistent with current hypotheses regarding the relationship between amphetamine-induced dopamine release capacity and mesostriatal dopaminergic function at rest. Moreover, this finding extends our understanding by showing a negative relationship between the natural variation in dopamine release capacity in humans and adaptive neural activation in the ventral striatum. However, it is important to note that the relationship between spontaneous dopamine neuron firing and amphetamine-induced dopamine release has yet to be tested, and that, while some studies report that amphetamine’s action is dependent on neuronal firing within the VTA (33, 34), acute amphetamine administration has generally been found to reduce dopamine neuron firing (51–53), as well as having other actions to increase striatal dopamine levels (4, 32, 54). Thus, preclinical studies that combine PET and dopamine neuron recordings would be useful to test the hypothesis that spontaneous dopamine neuron firing in the amphetamine-free state is directly associated with dopamine release induced by amphetamine. Consistent with a previous study using the same task (27), information-theoretic surprise was encoded in frontal brain areas including pre-SMA. We also replicated the finding that the effect size of this activation positively correlated with task performance, suggesting that surprising events may be imbued with higher salience in participants with a better model of the task (resulting in better performance) (27). Importantly, there was no relationship between the effect size of the neural response in this region and any PET measure of dopamine function, favoring a more specific role for dopamine in encoding meaningful information.
An influential model proposes that the anterior hippocampus regulates midbrain dopamine neuron firing depending on the novelty and context of stimuli through the descending arm of a hippocampal–VTA loop. Activity in projections from the VTA to the hippocampus, constituting the ascending arm of the loop, in turn facilitates the updating of memory by enhancing long-term potentiation in the hippocampus (5, 20). However, we found no evidence for increased hippocampal activity at cue onset, and there was no positive correlation between hippocampal activation and either meaningful (Bayesian) or meaningless (information-theoretic) surprise at monetary outcome. It should be noted, however, that our task was not optimized to detect event-related hippocampal activity relating to novelty processing or learning, as participants had been thoroughly trained on the task stimuli and structure before scanning. Nevertheless, further studies are required to investigate the relationship between prediction error signals (e.g., in the midbrain and orbitofrontal cortex) and hippocampal representations, given the proposed role of the hippocampus in the learning and remapping of internal models (“cognitive maps”) (11, 26, 55–57). It has also been suggested that a connection from the medial prefrontal cortex to the dopaminergic midbrain may convey information relating to inference about the environment (specifically, inference over possible hidden states of a task) (12). In line with this finding we found that belief updates were encoded in the medial frontal cortex, including dorsal anterior cingulate. This observation is consistent with previous human and nonhuman primate studies (26–28, 58, 59) as well as with suggestions that anterior cingulate cortex is active in novel or volatile environments wherein agents need to refine their internal models in light of new observations (28, 60).
Moreover, we also detected activation encoding belief updates in lateral prefrontal and posterior parietal cortical regions, which have been implicated in inference on the nature of the causal relationships between observations (hidden causal structures) (61) and in encoding state prediction errors that support learning an internal model of a task (state–action–state transition probabilities) (62). The ventral striatum and SN/VTA are implicated in encoding signed reward prediction errors that update action and state values (7, 63, 64). Ventral striatal encoding of these model-free reward prediction errors may be negatively related to ventral striatal dopamine synthesis capacity (65, 66). Consistent with previous studies using similar task designs (27, 67), we did not find strong evidence for effects within these regions for signed reward prediction errors. Previous studies have shown that the processing of reward anticipation and prediction error in the mesolimbic dopamine circuit is sensitive to current task demands, including action planning (68–70). In our task participants were not attempting to maximize reward, and the observation of monetary gains vs. losses was not indicative of task performance. Furthermore, unexpected outcomes were equally informative about changes in relevant cue modality, regardless of whether they took the form of a monetary gain or loss. Thus, the important contribution of our results is to highlight dopamine’s role in signaling belief updates beyond its role in signaling signed reward prediction errors, an observation that hints at a role for dopamine in probabilistic inference and structural learning. Consistent with this interpretation, a recent study employing electrophysiological recordings in behaving rats demonstrated that midbrain dopamine neurons that signal classical signed reward prediction errors also signal value-neutral sensory prediction errors (14).
Moreover, in humans the magnitude of value-neutral “stimulus identity” prediction errors in the midbrain is related to updates in state representation in the orbitofrontal cortex (71). The implication here is that dopamine has a wide range of functions that extends to updating a predictive associative model of the world, suggesting phasic dopamine activity signals a more general error signal, where value errors are a special case (10, 14). The findings of our study are highly relevant for dopaminergic and neurocomputational theories of schizophrenia (59, 72). The aberrant salience hypothesis proposes that symptoms such as paranoia arise when unwarranted meaning and behavioral salience are attributed to ambiguous, irrelevant, or unreliable stimuli (17, 18, 20–23). This is suggested to reflect maladaptive phasic dopamine signaling in a mesostriatal circuit, activity that underpins learning of cue values and associations under normal circumstances (7, 10, 14). Our results speak to this hypothesis in two ways. First, subclinical paranoia was negatively related to behavioral sensitivity to the meaningful information content of an observation, and also to the degree to which a participant’s performance correlated with that of an ideal Bayesian observer. This suggests that maladaptive belief updating (i.e., updating one’s beliefs following ambiguous or meaningless observations) may contribute to the formation of subclinical paranoid beliefs. Second, by dissociating the meaningful information content of an observation from its simple unexpectedness, and showing a dopaminergic relationship with the former, our findings point to the possibility of advances that might accrue from reformulating constructs such as “salience” in a more mathematically rigorous fashion.
In fact, one hypothesis from our findings is that the central feature of “aberrant salience” in psychotic disorders is a failure to dissociate between meaningful (task-relevant) and meaningless (task-irrelevant) information, resulting in belief updating arising out of merely surprising inputs (59).

Conclusions Using model-based fMRI we demonstrate that activity within both the midbrain and ventral striatum correlates with the magnitude of a belief shift following an observation, indicating that these structures encode the meaningful information content of a stimulus, as opposed to its simple unexpectedness (surprise). Moreover, using PET we demonstrate a potential dopaminergic basis for these neural signals. Specifically, neural encoding in the midbrain was negatively related to midbrain D2/3R availability, while encoding in the striatum was negatively related to striatal dopamine release capacity. Finally, we show that participants who displayed the least sensitivity to the meaningful content of observations also reported greater subclinical paranoid ideation. Together, our results suggest that the role of phasic mesolimbic dopamine activity extends beyond its well-established role in signaling signed reward prediction errors and includes updating a rich internal model of the world capable of supporting flexible behavior. Furthermore, our findings have relevance for understanding the pathophysiology of psychotic disorders such as schizophrenia, which are characterized by mesostriatal dopamine abnormalities and symptoms arising from aberrant inferences about the world, as manifested in delusions.

Acknowledgments We thank the MRI and PET technicians and radiographers at the Imanova Centre for Imaging Sciences, and Mark Ungless and Robert McCutcheon for comments on the initial manuscript. This study was funded by Medical Research Council-United Kingdom (UK) Grant MC-A656-5QD30 and Wellcome Trust Grant 094849/Z/10/Z (to O.D.H.) and the National Institute for Health Research Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. M.M.N. is supported by the National Institute for Health Research UK. T.D. is supported by EU-FP7 MC6 ITN IN-SENS Grant 607616. R.A.A. is supported by Academy of Medical Sciences Grant AMS-SGCL13-Adams and National Institute for Health Research Grant CL-2013-18-003. R.J.D. is supported by Wellcome Senior Investigator Award 098362/Z/12/Z. The Max Planck UCL Centre is a joint initiative supported by UCL and the Max Planck Society.

Footnotes Author contributions: M.M.N., T.D., P.S., R.A.A., T.H.B.F., M.B.W., R.J.D., and O.D.H. designed research; M.M.N., T.D., and R.A.A. performed research; P.S., T.H.B.F., and C.C. contributed analytic tools; M.M.N., T.D., P.S., and R.A.A. analyzed data; and M.M.N. wrote the paper with contributions from P.S., R.A.A., T.H.B.F., R.J.D., and O.D.H.

Conflict of interest statement: O.D.H. has received investigator-initiated research funding from and/or participated in advisory/speaker meetings organized by Astra-Zeneca, Autifony, BMS, Eli Lilly, Heptares, Janssen, Lundbeck, Lyden-Delta, Otsuka, Servier, Sunovion, Rand, and Roche. Neither O.D.H. nor his family have been employed by or have holdings or a financial stake in any biomedical company.

This article is a PNAS Direct Submission. R.M. is a guest editor invited by the Editorial Board.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1809298115/-/DCSupplemental.