Participants

Forty healthy participants with no history of psychological or neurological disorders were recruited from the University of Southern California community and the surrounding Los Angeles area (mean age: 24.30 ± 0.92 years, range: 18–39 years, 20 male). All participants were right-handed according to their own report. Subjects were paid $20 per hour for their participation and gave informed consent. All experimental protocols were approved by the Institutional Review Board of the University of Southern California and procedures were carried out in accordance with the approved guidelines. All participants had spent the majority of their lives living in the United States, spoke fluent English, identified themselves as politically liberal, and had strongly held political and non-political beliefs. Specifically, participants answered a screening questionnaire in which they were asked about their political identification. On the question “Do you consider yourself a political person?” answers ranged on a scale from 1 (not at all) to 5 (very much). Participants were only included if they answered 4 or 5 on this question. For the question “Which of the following describes your political self-identification?” answers ranged from 1 (strongly liberal) to 7 (strongly conservative), and participants were only included if they answered 1 or 2. Additionally, participants rated their agreement with several political and non-political statements and were only included in the experiment if they strongly agreed with at least 8 political and 8 non-political statements. Of 116 people who responded to our advertisements, 98 met the requirements for age, handedness, and political orientation. Of those 98 people, 40 met the requirement of strongly agreeing with at least 8 statements in each category.
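The questionnaire-based inclusion rules described above can be summarized as a simple predicate. The following is a minimal sketch only: the class, field, and function names are hypothetical and are not taken from the study materials, and the age and handedness requirements are not modeled.

```python
from dataclasses import dataclass


@dataclass
class ScreeningResponse:
    """One respondent's screening answers (field names are hypothetical)."""
    political_person: int       # 1 (not at all) to 5 (very much)
    political_orientation: int  # 1 (strongly liberal) to 7 (strongly conservative)
    strong_political: int       # political statements strongly agreed with
    strong_nonpolitical: int    # non-political statements strongly agreed with


def meets_belief_criteria(r: ScreeningResponse) -> bool:
    """Apply the political-identification and belief-strength criteria from the text."""
    return (
        r.political_person >= 4
        and r.political_orientation <= 2
        and r.strong_political >= 8
        and r.strong_nonpolitical >= 8
    )
```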

Stimuli

In this experiment, each participant read 8 political statements and 8 non-political statements with which they had previously indicated strong agreement. Each statement was followed by 5 challenges. Each challenge was a sentence or two that provided a counter-argument or evidence against the original statement.

The 8 political statements for each participant were drawn from a pool of 9 political statements. These statements concerned policy issues on which we expected predictable, identity-consistent positions from our subjects, such as “Abortion should be legal” and “Taxes on the wealthy should generally be increased”. The statements can be found in full in Table S3. The 8 non-political statements were drawn from a pool of 14 non-political statements. The pool of non-political statements was larger because, while the inclusion criteria guaranteed that participants would hold certain political beliefs, they did not guarantee belief in any specific non-political statement. The non-political statements covered a wide range of topics including health (e.g. “Taking a daily multivitamin improves ones health”), education (e.g. “A college education generally improves a person’s economic prospects”), and history (e.g. “Thomas Edison invented the light bulb”).

Each political and non-political statement was associated with 5 challenges. In order to be as compelling as possible, the challenges often contained exaggerations or distortions of the truth. For instance, one challenge to the statement “The US should reduce its military budget” was “Russia has nearly twice as many active nuclear weapons as the United States”. In truth, according to statistics published in 2013 by the Federation of American Scientists (Status of World Nuclear Forces, www.fas.org), Russia had approximately 1,740 active nuclear warheads, while the United States had approximately 2,150. Examples of the challenges are provided in Table S4.

The political and non-political statements did not differ in number of words (political: 11.22 ± 1.51, non-political: 11.14 ± 1.33, p = 0.97), letters (political: 59.33 ± 7.71, non-political: 58.64 ± 6.04, p = 0.94), or Flesch reading ease (political: 60.7 ± 17.89, non-political: 48.3 ± 28.64, p = 0.26) [24]. The political and non-political challenges also did not differ in number of words (political: 20.44 ± 2.83, non-political: 18.92 ± 1.13, p = 0.15), letters (political: 104.18 ± 15.50, non-political: 96.24 ± 6.08, p = 0.15), or Flesch reading ease (political: 53.9 ± 21.02, non-political: 55.74 ± 19.11, p = 0.65).
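For reference, the Flesch reading ease score used in the comparison above follows a standard published formula based on average sentence length and average syllables per word. The sketch below assumes the word, sentence, and syllable counts are already available; how syllables were counted for the stimuli is not stated here, so the function takes the counts as inputs rather than estimating them.

```python
def flesch_reading_ease(n_words: int, n_sentences: int, n_syllables: int) -> float:
    """Standard Flesch reading ease score; higher values indicate easier text."""
    if n_words <= 0 or n_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = n_words / n_sentences
    syllables_per_word = n_syllables / n_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```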

Because we were interested in brain structures that are known to respond to social and mental stimuli, we used a word counting method to count the frequency of social and cognitive words within the stimuli. This technique is similar to Linguistic Inquiry and Word Count (LIWC) [25], but used the open-source software tool TACIT [26] version 1.0.0 in combination with the LIWC 2007 dictionary to count words in the social and cognitive process categories. We found that such words were infrequent in our stimuli and occurred at similar rates across the two categories (social words occurred with a frequency of 5.07% in the political challenges and 5.35% in the non-political challenges; cognitive words occurred with a frequency of 10.14% in the political challenges and 11.9% in the non-political challenges).
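Dictionary-based word counting of this kind can be illustrated with a short sketch. This is not TACIT's implementation, and the word list below is a hypothetical toy list rather than the LIWC 2007 dictionary; the only convention borrowed from LIWC-style dictionaries is that a trailing asterisk marks a word stem.

```python
import re

# Hypothetical toy category; the real LIWC 2007 dictionary is not reproduced here.
TOY_SOCIAL = ["friend*", "talk*", "they"]


def category_frequency(text: str, patterns: list[str]) -> float:
    """Percentage of word tokens in `text` that match any dictionary pattern.

    A pattern ending in '*' matches any token that starts with the stem;
    all other patterns must match the whole token.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    exact = {p for p in patterns if not p.endswith("*")}
    stems = tuple(p[:-1] for p in patterns if p.endswith("*"))
    hits = sum(1 for t in tokens if t in exact or t.startswith(stems))
    return 100.0 * hits / len(tokens)
```

Applied to the sentence "They talked to a friend about taxes." with the toy list, three of the seven tokens match.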

Experimental Procedure

In preparation for the study, participants filled out a survey of demographic information, answered questions about their political and religious affiliations, and indicated the degree to which they agreed with political and non-political statements. Only statements for which participants chose 6 or 7 (where 1 was strongly disagree and 7 was strongly agree) were used during their scan. If a given subject strongly believed more than 8 statements in a category, the statements were chosen for that subject as follows. First, preference was given to more strongly held beliefs (7 vs. 6). Second, all else being equal, preference was given to statements that were less commonly believed, in order to balance the frequency of statements in the experiment.
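The two-step preference rule above amounts to a sort on two keys. The sketch below is one way to express it; the function name, the `popularity` tally (how many participants strongly believe each statement), and the alphabetical tie-break used for determinism are assumptions, not details from the study.

```python
def select_statements(ratings: dict[str, int],
                      popularity: dict[str, int],
                      n: int = 8) -> list[str]:
    """Choose up to `n` strongly held statements for one participant.

    ratings:    statement -> agreement on the 1-7 scale
    popularity: statement -> number of participants who strongly believe it
                (hypothetical tally used to favor less common statements)
    """
    eligible = [s for s, r in ratings.items() if r >= 6]
    # Stronger belief first (7 before 6), then less commonly believed statements.
    eligible.sort(key=lambda s: (-ratings[s], popularity.get(s, 0), s))
    return eligible[:n]
```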

When participants arrived for their fMRI scan, they received instructions and had the opportunity to ask questions of the experimenter. After the instructions, they performed a practice task, which consisted of a shortened version of one trial of the experiment using the statement “Cats make better pets than dogs”, followed by three challenges to that statement. Following the practice task, participants underwent BOLD fMRI. For each participant there were 4 belief-challenging scans (420 seconds each). During the belief-challenging scans, each statement was presented for 10 seconds, followed by a variable delay of 4–6 seconds. Participants were instructed to press a response button when they had read and understood the statement. Five challenges to the original statement were then presented, each for 10 seconds. Again, participants pressed a response button when they had read and understood the challenge. After all five challenges had been presented, the original statement was presented again and participants had 12 seconds to rate their strength of belief in the statement. Participants indicated their responses via button presses on an MRI-compatible button box held in the right hand. They pressed buttons to move a cursor left and right along a Likert scale to indicate the strength of their belief on a scale from 1 (strongly disbelieve) to 7 (strongly believe). The cursor started in the middle position of the scale. Two political and two non-political statements were presented in each of the four fMRI scans. The order of these conditions was randomized within each scan, and the statements within each condition were assigned random positions within the experiment for each subject. The temporal structure of the trials and runs is depicted in Fig. S1.
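The randomization described above (two political and two non-political statements per scan, random condition order within each scan, random placement of statements across scans) could be sketched as follows. The function name, the `seed` argument, and the exact shuffling scheme are assumptions for illustration; the actual presentation software is not described here.

```python
import random


def assign_runs(political, nonpolitical, n_runs=4, per_condition=2, seed=None):
    """Distribute statements over runs and shuffle trial order within each run.

    Returns a list of runs; each run is a list of (condition, statement) tuples.
    """
    expected = n_runs * per_condition
    if len(political) != expected or len(nonpolitical) != expected:
        raise ValueError(f"expected {expected} statements per condition")
    rng = random.Random(seed)
    pol, non = list(political), list(nonpolitical)
    rng.shuffle(pol)  # random position of each statement across the experiment
    rng.shuffle(non)
    runs = []
    for i in range(n_runs):
        chunk = slice(i * per_condition, (i + 1) * per_condition)
        trials = [("political", s) for s in pol[chunk]]
        trials += [("non-political", s) for s in non[chunk]]
        rng.shuffle(trials)  # random condition order within the run
        runs.append(trials)
    return runs
```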

Following the fMRI session, participants filled out a short questionnaire. They were asked to rate how credible they found the challenges in general, and how challenging they were to their beliefs. Participants did not make separate ratings per item or per category, but rather answered these questions about their reaction to the stimulus set in general. During the debriefing, subjects were given a packet of sourced information which detailed the truth of each challenge they read inside the scanner and provided resources on where to find further information.

MRI Scanning

Imaging was performed using a 3T Siemens MAGNETOM Trio System with a 12-channel matrix head coil at the Dana and David Dornsife Neuroscience Institute at the University of Southern California. Functional images were acquired using a gradient-echo, echo-planar, T2*-weighted pulse sequence (TR = 2000 ms, one shot per repetition, TE = 25 ms, flip angle = 90°, 64 × 64 matrix, phase encoding direction anterior to posterior, GRAPPA acceleration factor = 2, fat-sat fat suppression). Forty slices covering the entire brain were acquired with an in-plane voxel resolution of 3.0 × 3.0 mm and a slice thickness of 3.4 mm with no gap. Slices were acquired in interleaved ascending order, and 210 functional volumes were acquired in each run, not including 3 volumes discarded by the scanner to account for T1 equilibrium effects. A gradient-echo field map was also acquired with the same slices and resolution as the functional images using a Siemens field map sequence (TR = 1000 ms, TE1 = 10 ms, TE2 = 12.45 ms, flip angle = 90°, 64 × 64 matrix).
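As a quick consistency check, the functional acquisition parameters above can be combined arithmetically. The run duration and slab coverage follow directly from the stated values; the field of view is not stated in the text and is inferred here from the matrix size and in-plane resolution, so treat it as a derived assumption.

```python
TR_S = 2.0                # repetition time in seconds (TR = 2000 ms)
N_VOLUMES = 210           # functional volumes retained per run
N_SLICES = 40
SLICE_THICKNESS_MM = 3.4  # no gap between slices
MATRIX = 64
IN_PLANE_MM = 3.0

run_duration_s = TR_S * N_VOLUMES                 # matches the 420-second scans
slab_coverage_mm = N_SLICES * SLICE_THICKNESS_MM  # about 136 mm of coverage
field_of_view_mm = MATRIX * IN_PLANE_MM           # inferred, not stated in the text
```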

A T1-weighted high-resolution (1 × 1 × 1 mm) image was acquired using a three-dimensional magnetization-prepared rapid acquisition gradient echo (MPRAGE) sequence (TR = 2530 ms, TE = 3.13 ms, flip angle = 10°, 256 × 256 matrix, phase encoding direction right to left, no fat suppression). Two hundred and eight coronal slices covering the entire brain were acquired in interleaved order with a voxel resolution of 1 × 1 × 1 mm. We also collected a T2-weighted anatomical scan (TR = 10,000 ms, TE = 88 ms, flip angle = 120°, 256 × 256 matrix) with 40 transverse slices with a voxel resolution of 0.82 × 0.82 × 3.5 mm that was reviewed by a radiologist to rule out incidental findings.

fMRI Data Analysis

fMRI analysis was performed using FEAT version 6.00, FSL’s fMRI analysis tool (FMRIB’s Software Library, http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/), and other FSL tools from FSL version 5.0.8. Data were first corrected for magnetic field inhomogeneities using the field maps acquired for each subject and FSL’s FUGUE utility for geometrically unwarping EPIs, unwarping in the anterior-posterior (−y) direction with a 10% signal loss threshold. Data were then preprocessed using standard steps in the following order: motion correction using a rigid-body alignment to the middle volume of each run, slice-timing correction using Fourier-space time-series phase-shifting, skull removal using FSL’s BET brain extraction tool, 5 mm FWHM spatial smoothing, and highpass temporal filtering using Gaussian-weighted least-squares straight line fitting with a sigma of 60 s (corresponding to a period of 120 s). Finally, temporal autocorrelation was removed using FSL’s prewhitening algorithm before statistical modeling [27].

The skull was removed from the T1 images using the BET brain extraction tool with a fractional intensity threshold of 0.4, specifying the voxel that represented the approximate center of the brain. We then used FLIRT to register the functional images to the skull-stripped T1-weighted MPRAGE using its boundary-based registration (BBR) algorithm. Next, the MPRAGE was registered to the standard MNI atlas with a 12-degree-of-freedom affine transformation, and this transformation was then refined using FNIRT nonlinear registration with a warp resolution of 10 mm [28].

Data were then analyzed within the General Linear Model using a multi-level mixed-effects design. Each component of the task (statement, challenge, and rating) was modeled by convolving the task design with a double-gamma hemodynamic response function with a phase of 0 s. The task periods were defined from stimulus onset to stimulus offset. Political and non-political trials were modeled using separate regressors, yielding six task regressors. The temporal derivative of each task regressor and six motion correction parameters were also included in the design. At the individual subject level, statistical maps were generated for each functional scan. These were then combined into individual participant-level maps in a fixed effects analysis across each subject’s four scans. Subject-level maps were then entered into a higher-level group analysis to examine group-level effects using a “mixed effects” design with FLAME1.
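To illustrate how a task regressor of this kind is built, the sketch below convolves a boxcar (stimulus onset to offset) with a double-gamma hemodynamic response function and samples the result once per TR. The HRF parameters (gamma shapes 6 and 16, undershoot ratio 1/6) are a commonly used canonical form and are an assumption here; FSL's own double-gamma parameterization may differ, and the temporal derivatives, motion parameters, and prewhitening are omitted.

```python
import math


def gamma_pdf(t: float, shape: float) -> float:
    """Gamma density with unit scale, evaluated at time t (seconds)."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def double_gamma_hrf(t: float, peak_shape: float = 6.0,
                     undershoot_shape: float = 16.0, ratio: float = 1 / 6) -> float:
    """Canonical-style double-gamma HRF (parameters are assumed, not FSL's)."""
    return gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, undershoot_shape)


def task_regressor(onsets, durations, n_vols=210, tr=2.0, dt=0.1, hrf_len=32.0):
    """Convolve an onset-to-offset boxcar with the HRF and sample it every TR."""
    n_fine = int(round(n_vols * tr / dt))
    boxcar = [0.0] * n_fine
    for onset, dur in zip(onsets, durations):
        start = max(int(round(onset / dt)), 0)
        stop = min(int(round((onset + dur) / dt)), n_fine)
        for i in range(start, stop):
            boxcar[i] = 1.0
    kernel = [double_gamma_hrf(k * dt) * dt for k in range(int(round(hrf_len / dt)))]
    step = int(round(tr / dt))
    regressor = []
    for v in range(n_vols):
        i = v * step  # fine-grid index of this volume's acquisition time
        acc = 0.0
        for k, h in enumerate(kernel):
            if i - k < 0:
                break
            acc += h * boxcar[i - k]
        regressor.append(acc)
    return regressor
```

For example, `task_regressor([10.0], [10.0])` returns a 210-sample regressor for a single 10-second stimulus starting 10 seconds into the run.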

To explore the relationship between the degree of belief change and brain activity in response to specific statements, we also performed a whole brain item-wise analysis. In this analysis, we first modeled each lower-level run with a design that specified a single regressor for each trial’s statements and challenges. Therefore, in this design, there were 8 task regressors (4 statements and 4 challenge periods) in addition to the six motion parameters. Task periods were modeled as in the previous analysis, using the time from stimulus onset to offset convolved with a double-gamma hemodynamic response function with phase 0 s. We then computed brain-activity maps for each specific stimulus item, combining across all subjects who read that stimulus using a second-level FLAME1 “mixed effects” design to produce item-level activity maps. These item-level activity maps were then tested for correlation with the average belief change across items in a third-level FLAME1 design that included belief change as a between-items covariate.

For all whole-brain analyses, statistical thresholding was performed using FSL’s cluster thresholding algorithm to control for multiple comparisons. This algorithm uses Gaussian Random Field Theory to estimate the probability of clusters of a given size taking into account the smoothness of the data. We used a Z threshold of 2.3, and a cluster size probability threshold of p < 0.05.

In addition to whole brain analysis, we performed a region of interest (ROI) analysis focusing on a priori ROIs in the amygdala and insular cortex. We chose these two ROIs because of their well-known roles in emotion and feeling. For this analysis, beta values from the GLM analysis were extracted for each subject and averaged within each ROI. The contrast used for this analysis combined activity from the period when participants were reading all political and non-political challenges to their beliefs. Because there was very little belief change for political statements, we used belief change on non-political statements as our measure of individual variability. The beta values and the average belief change scores were subjected to a Shapiro-Wilk test for normality. The beta values were then correlated with each participant’s average belief change score using Pearson’s correlation. The regions of interest were defined as follows: For the amygdala, we used the Harvard-Oxford Atlas amygdala mask, thresholded at 25%. For the insula, we used masks of the dorsal anterior, ventral anterior, and posterior insula defined by a study that performed a cluster analysis of functional connectivity patterns [29].
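The ROI procedure above reduces to two steps per region: average the beta values inside the mask for each subject, then correlate those averages with belief change across subjects. The sketch below shows both steps using only the standard library. Representing an image as a flat list of voxel values is a simplification, the function names are hypothetical, and the Shapiro-Wilk normality test is not implemented here (a statistics package such as SciPy provides one as `scipy.stats.shapiro`).

```python
import math


def roi_mean(betas, mask):
    """Average the beta values of voxels where the (flattened) mask is non-zero."""
    values = [b for b, m in zip(betas, mask) if m]
    if not values:
        raise ValueError("ROI mask selects no voxels")
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

In use, `roi_mean` would be called once per subject and ROI, and the resulting list of per-subject means passed to `pearson_r` together with the per-subject belief change scores.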

Follow-up Questionnaire

Following their participation in the fMRI portion of the study, participants were sent an online questionnaire asking them to indicate how strongly they agreed with each statement they had seen during their fMRI scan. The average time between a participant’s scan and completion of the questionnaire was 48.36 ± 5.85 days.