Experiment 1: Susceptibility to a group’s punishment preferences as Last Decider

We first examined whether decisions to restore justice are malleable, and if so, whether these moral preferences are sensitive to the proportion of other individuals endorsing a punitive response in the wake of a fairness violation. Subjects completed a variant of the Justice Game, an economic task specifically designed to measure punitive and non-punitive responses to fairness violations33. In this task, Player A is endowed with $10 on each trial and can choose how much to split with the subject, who always takes the role of Player B (Fig. 1a). Player A’s splits ranged from mildly unfair (a 6/4 split in which Player A keeps $6 and offers $4) to highly unfair (9/1 split) in $1 increments. After receiving an unfair offer, Player B then decided how to restore justice by choosing to: (i) Accept Player A’s split as-is; (ii) Compensate themselves in a cost-free manner by increasing their own payout to match Player A’s payout, without punishing Player A; or (iii) Reverse Player A’s split, a highly retributive option that maximizes Player B’s own payout and minimizes Player A’s payout (Fig. 1a).
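
The three response options can be written out as a small payout rule. The sketch below is illustrative only: the function name and the string labels are hypothetical, and it assumes that Reverse simply swaps the two payouts, as the description of the 6/4 to 9/1 splits suggests.

```python
def justice_game_payouts(a_keeps, b_offered, choice):
    """Return (player_a_payout, player_b_payout) in dollars for one trial.

    a_keeps   -- amount Player A kept for themselves (e.g., 6 in a 6/4 split)
    b_offered -- amount Player A offered to Player B (e.g., 4 in a 6/4 split)
    choice    -- Player B's response: "accept", "compensate", or "reverse"
    """
    if choice == "accept":
        # Player A's split stands as-is.
        return a_keeps, b_offered
    if choice == "compensate":
        # Player B's payout is raised to match Player A's; A is not punished.
        return a_keeps, a_keeps
    if choice == "reverse":
        # Assumed here: the split is swapped, so A receives what A offered.
        return b_offered, a_keeps
    raise ValueError(f"unknown choice: {choice!r}")
```

Under this rule, Compensate and Reverse give Player B the same payout, so the only difference between them is what happens to Player A.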

Figure 1 Task and trial structure for Experiments 1 (Last Decider) and 2 (First Decider). (a) In the modified Justice Game, Player A is endowed with $10 at the beginning of each trial. Player A offers some portion of that money to Player B, who can choose to Accept, Compensate, or Reverse the offer. In our task, Player As always made unfair offers. Because Player Bs could maximize their payout by choosing either Compensate or Reverse, decisions to Reverse can be interpreted as cost-free punishment of Player As. (b) Subjects in the Last Decider experiment observed their group’s punishment preferences prior to indicating their own punishment preference. (c) Subjects in the First Decider experiment indicated their punishment preference prior to observing their group’s preferences. Although group members are represented by icons in this figure, subjects were shown photographs in the actual experiment.

Subjects first completed a Solo Phase of this task, allowing us to measure subjects’ punishment preferences in the absence of any social influence. To account for the possibility that subjects might infer strategic motives behind Player A offers, payout at the end of the study was implemented probabilistically such that Player A’s original offer was enacted half of the time and Player B’s decision was enacted half of the time (see Supplementary Methods). Indeed, subjects Reversed at greater rates as offers became increasingly unequal, indicating that they perceived small offers as unfair and punished the perpetrator accordingly. Subjects played with new Player As on each trial.

Subjects then completed a Group Phase of the task, in which they were told that they were sharing the role of Player B alongside four other subjects, and that their responses would collectively determine the payouts of all the players (Fig. 1b). Specifically, subjects were informed that the payout redistribution would be determined by the majority choice of the five Player Bs, that Player A’s payout would be affected by Player Bs’ collective decision just as it had in the Solo Phase, and that each Player B would receive the full payout. As in the Solo Phase, each of the players sharing the role of Player B could choose to Accept, Compensate, or Reverse Player A’s split, and subjects observed the other Player Bs’ choices in sequence, as if they were made in real time (Fig. 1b). Subjects always responded last, following the format of the classic Asch conformity paradigm1. Subjects played with new Player As and Bs on each trial. To test the possibility that endorsement rates of the punitive Reverse option would increase as the number of punishers within the group increased, we parametrically and deterministically varied the proportion of players who selected the punitive option from 0% to 100%.

To examine whether a group’s punitive preferences alter an individual’s willingness to punish, we conducted a mixed-effects logistic regression analysis. We modeled the probability of punishing as an additive combination of the proportion of punishers in the group (centered around 50% punishers), the unfairness of Player A’s offer (centered around medium unfairness), and each subject’s baseline preference for punishment (i.e., the proportion of Solo Phase trials in which a subject chose to Reverse instead of Compensate, matched for offer unfairness and centered around 50% punishment). Results reveal that individuals significantly increase punishment as a greater proportion of the group expresses a punitive preference, even after accounting for offer unfairness and baseline punitive preferences (Table 1; Fig. 2). This effect was so strong that the group was ultimately able to shift individuals’ predicted punishment rates as much as 30 percentage points.
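
The fixed-effects part of this model can be sketched as a centered logistic predictor. This is a simplified illustration, not the fitted model: the random effects are omitted, the coefficient values used in the test are arbitrary placeholders rather than estimates from Table 1, and the coding of unfairness as 1 to 4 (for the 6/4 through 9/1 splits) with a midpoint of 2.5 is an assumption.

```python
import math

def punish_probability(prop_punishers, unfairness, baseline, coefs):
    """Predicted probability of choosing Reverse over Compensate.

    prop_punishers -- proportion of the group choosing Reverse (0.0 to 1.0)
    unfairness     -- offer unfairness, assumed coded 1 (6/4) to 4 (9/1)
    baseline       -- subject's Solo Phase proportion of Reverse choices
    coefs          -- dict of hypothetical fixed-effect coefficients (log-odds)
    """
    log_odds = (
        coefs["intercept"]
        + coefs["group"] * (prop_punishers - 0.5)   # centered at 50% punishers
        + coefs["unfairness"] * (unfairness - 2.5)  # centered at medium unfairness
        + coefs["baseline"] * (baseline - 0.5)      # centered at 50% punishment
    )
    return 1.0 / (1.0 + math.exp(-log_odds))
```

Because every predictor is centered, the intercept is the log-odds of punishing for a subject with a 50% baseline who faces a medium-unfairness offer in an evenly split group.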

Table 1 Experiment 1: Last Decider.

Figure 2 Results for Experiments 1 (Last Decider) and 2 (First Decider). For both experiments, preference for punishment increases with the proportion of punishers. This effect remains significant, albeit reduced, when subjects are observing the moral preferences of past groups who cannot affect the current choice (First Decider). Error bars reflect ± 1 SEM.

Experiment 2: Influence of past groups’ punishment preferences as First Decider

The findings from Experiment 1 are, to our knowledge, the first to illustrate that individuals’ moral attitudes about punishment vary with the intensity of a group’s preference. To examine the robustness of this conformity effect and the potency of social influence, we conducted a second experiment examining the possibility that individuals might even conform to previous groups endorsing punishment. Specifically, we tested two competing hypotheses: whether the effect of social influence is abolished as soon as a past group’s preferences are no longer relevant to the current decision context1,34, or whether knowing that a past group sanctioned punishment would also affect how readily an individual chooses to punish in the present moment. Evidence for the second hypothesis would suggest that decisions to punish are highly malleable. To probe whether subjects’ punitive behaviors are susceptible to influence even from past groups’ preferences, subjects in Experiment 2 were first in their group to decide during the Group Phase (i.e., playing as the “First Decider”, in contrast to Experiment 1, in which subjects played as the “Last Decider”).

The methodology of Experiment 2 was identical to that of Experiment 1, with one key difference: the sequence of Player Bs in the Group Phase was fixed so that subjects were always the first person in the group to make their decision (Fig. 1c). When deciding first, subjects did not have access to their present group’s punitive preferences. Therefore, on any given trial, the only two pieces of information that could affect a subject’s decision were the unfairness of Player A’s offer and the preferences of past groups. Given that behavioral economic experiments examining fairness norms have found that people are sensitive to the magnitude of fairness violations, including in the Justice Game33, we defined the proportion of punishers in a way that would allow us to match fairness violations across trials. We employed a simple analysis approach such that on each trial, we recorded Player A’s fairness violation (from mildly unfair splits of $6/$4 to highly unfair splits of $9/$1), searched backwards until we found the most recent past trial in which the severity of the fairness violation matched that of the present trial, and then used the proportion of punishers observed on that past trial to predict decisions to punish on the current trial. Alternative models with different operationalizations of trial history are discussed in the Supplementary Results; we report the best-fitting model here.
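
The backward search over trial history can be sketched as follows. The function and argument names are hypothetical, and returning `None` when no earlier matching trial exists is an assumption, since the handling of such trials is not described here.

```python
from typing import Optional, Sequence

def matched_past_proportion(
    offers: Sequence[int],
    proportions: Sequence[float],
    trial: int,
) -> Optional[float]:
    """Proportion of punishers seen on the most recent matching past trial.

    offers      -- amount offered to Player B on each trial (e.g., 4 for a $6/$4 split)
    proportions -- proportion of punishers the subject observed on each trial
    trial       -- index of the current trial (0-based)
    """
    # Walk backwards from the trial just before the current one.
    for past in range(trial - 1, -1, -1):
        if offers[past] == offers[trial]:
            return proportions[past]
    # Assumption: no predictor is defined when no earlier trial matches.
    return None
```

For example, with offers `[4, 1, 4, 2, 1]`, the predictor for the final trial (a $9/$1 split) is the proportion of punishers observed on the second trial, the most recent earlier trial with the same split.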

This task design also allows us to address two potential concerns about the validity of our paradigm: first, that explicit pressure to conform in Experiment 1 encouraged subjects to behave in ways they believed would please the experimenter, and second, that the observed conformity effect is only observable under conditions where people believe that their individual decision has little influence on the outcome of the group’s final decision. Therefore, in addition to testing whether punishment decisions remain susceptible to past groups’ influence, the task design of Experiment 2 also allowed us to determine whether the conformity effect replicates in a decision context where explicit pressure to conform is reduced and where subjects explicitly know their choice can affect the final outcome.

As in Experiment 1, results indicate that increasing the proportion of punishers within a group significantly enhances punishment endorsement rates, even when accounting for offer unfairness and baseline punitive preferences (Table 2; Fig. 2). Although the strength of the conformity effect was attenuated relative to when individuals were last to decide, our results illustrate that individuals still conform when their decisions are consequential and when they are relatively free from experimenter demand effects. An ancillary analysis also demonstrates our results are not significantly modulated by the time elapsed since the key trial used to define the proportion of punishers (Supplementary Results).

Table 2 Experiment 2: First Decider.

Experiment 3: Drift diffusion model of conformity as a Victim

Experiments 1–2 reveal just how powerfully social influence can alter individuals’ endorsement of punishment as a means of restoring justice. However, it remains an open question how social influence operates on the cognitive mechanisms underlying decisions to punish norm violators, thereby producing conformist behaviors. To explore this question, we performed a third experiment leveraging the Drift Diffusion Model (DDM), a computational model of decision-making. Though the DDM is typically used in perceptual decision-making contexts, it is well-suited for examining the effects of social influence, and has recently gained some traction within the social domain, successfully explaining how social groups bias perceptual judgments and even how altruistic decisions unfold35,36.

In the context of our task, DDM uses choices and reaction time distributions to characterize how people integrate evidence about the value of punishment (such as the group’s punitive preference) relative to compensation37. This allows us to decompose the decision-making process into psychologically-meaningful parameters. First, bias (z) quantifies the extent to which people lean towards one of the justice restoration options prior to observing any evidence. Second, decision threshold (a) indexes the amount of evidence subjects need in order to make a choice, thereby capturing how cautiously individuals make choices in the presence (and absence) of group influence. Third, drift rate (v) indexes the strength of evidence favoring either punishment or compensation obtained from observing the group’s preference, and therefore how much an individual weighs the value of punishment relative to the value of compensation36,38. Importantly, the strength of evidence in our task is not dependent on the dynamics of the stimulus (such as movement coherence in random dot motion tasks), but instead reflects the psychological process through which a group’s punitive preferences are dynamically integrated into an individual’s valuation of punishment.

Therefore, DDM enables us to rigorously test two longstanding hypotheses from social psychological theory about how group influence acts on the cognitive mechanisms governing decision-making. First, we can test whether the presence of a group majority lowers the stakes of an individual’s decision (as the subject is now merely one vote out of many), thereby reducing the total amount of evidence required to commit to a choice and leading the individual to relinquish moral responsibility. Second, we can test whether the proportion of people within a group endorsing a punitive option increases the strength of evidence that punishment is a valuable method for restoring justice.
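
To make the roles of bias (z), threshold (a), and drift rate (v) concrete, the sketch below simulates single trials of a basic diffusion process. It is a generic illustration under stated assumptions, not the estimation procedure used in this study: the noise scale is fixed at 1, non-decision time is omitted, z is expressed as a proportion of a, and all parameter values are arbitrary.

```python
import math
import random

def simulate_ddm_trial(v, a, z, rng, dt=0.001, noise=1.0, max_time=10.0):
    """Simulate one drift diffusion trial with Euler steps.

    v   -- drift rate (positive favors punishment, negative favors compensation)
    a   -- distance between the two decision boundaries
    z   -- starting point as a proportion of a (0.5 = unbiased)
    rng -- random.Random instance, so simulations are reproducible
    Returns (choice, decision_time); choice is None if no boundary is reached.
    """
    evidence = z * a
    elapsed = 0.0
    step_sd = noise * math.sqrt(dt)
    while 0.0 < evidence < a and elapsed < max_time:
        # Deterministic drift plus Gaussian noise on each small time step.
        evidence += v * dt + step_sd * rng.gauss(0.0, 1.0)
        elapsed += dt
    if evidence >= a:
        return "punish", elapsed        # upper boundary
    if evidence <= 0.0:
        return "compensate", elapsed    # lower boundary
    return None, elapsed                # timed out without a decision
```

In this sketch, raising v pushes choices toward the punishment boundary, while lowering a shortens decision times and makes choices noisier, which is the signature of reduced caution that the first hypothesis predicts for group majorities.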

Given that the greatest conformity effects occur when subjects are the last to decide within the group, Experiment 3 followed the same task structure as Experiment 1 (Last Decider), with a few key modifications to make the task suitable for DDM. First, because the best-established DDMs only account for binary choices, subjects were only presented with the Compensate and Reverse options on each trial. Second, subjects sometimes made decisions in the absence of information about others’ preferences during the Group Phase of the task (hereafter referred to as the Alone condition), which allows us to avoid ordering confounds from the Solo Phase that might contaminate DDM parameters (e.g. from unrelated motor or task learning). Third, offers from Player As were placed within discrete monetary bins and jittered to offset task habituation from repeatedly seeing only four distinct offer types (as in Experiments 1–2), which was especially important given the large number of trials needed to estimate DDM parameters. Therefore, offers from Player As were binned into three levels of unfairness, all drawn from a uniform distribution in increments of 10¢: Mildly Unfair offers between $3.70 and $4.90, Somewhat Unfair offers between $1.90 and $3.10, and Highly Unfair offers between $0.10 and $1.30. Finally, because DDM requires reaction time (RT) distributions to capture the decision process in its entirety37, subjects were simultaneously presented with Player A’s offer and four randomly-sampled responses from other Player Bs (Fig. 3a,c; this is in contrast to Experiments 1–2, in which subjects observed Player A offers and Player B choices sequentially). Subjects were free to make a response at their own pace, but were encouraged to make decisions as quickly as possible.
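
The binned, jittered offers can be sketched as a uniform draw over 10¢ steps. Amounts are kept in integer cents to avoid floating-point rounding. The names are hypothetical, and treating both endpoints of each bin as possible offers is an assumption.

```python
import random

# Offer ranges in cents (amount offered to Player B out of a $10 endowment).
OFFER_BINS_CENTS = {
    "Mildly Unfair": (370, 490),
    "Somewhat Unfair": (190, 310),
    "Highly Unfair": (10, 130),
}

def sample_offer(level, rng):
    """Draw one offer (in cents) uniformly from the bin, in 10-cent steps."""
    low, high = OFFER_BINS_CENTS[level]
    # randrange excludes the stop value, so add 1 to keep `high` reachable.
    return rng.randrange(low, high + 1, 10)

def split_from_offer(offer_cents, endowment_cents=1000):
    """Return (player_a_keeps, player_b_offered) in cents."""
    return endowment_cents - offer_cents, offer_cents
```

Each bin then contains 13 possible offers, compared with the four fixed splits used in Experiments 1–2.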

Figure 3 Schematics and behavioral results of Experiments 3 and 4 (Victim and Juror). (a) As in the Last Decider study, subjects in Experiments 3 and 4 observed their group’s punishment preferences prior to indicating their own punishment preference. In this example, the group is evenly split between punishers and compensators. (b) According to the Drift Diffusion Model (DDM), people make decisions by noisily accumulating evidence in favor of each of the options until a decision bound is reached. In this cartoon schematic, subjects are biased such that they initially prefer compensation over punishment, indicated by the starting point z; they require some threshold of evidence in order to commit to a choice, indicated by a; and they accumulate evidence in favor of compensation, indicated by v. Together, these parameters dictate the shapes of reaction time distributions for punishment and compensation. (c) In this example trial, the group unanimously prefers to punish the unfair offer. (d) We illustrate using a cartoon schematic how conformity to the group underlies shifts in the v and a parameters, thereby changing choice and reaction time distributions. (e) When deciding as either a Victim or Juror, preference for punishment increases with the proportion of punishers, replicating Experiment 1 (Last Decider). Error bars reflect ± 1 SEM.

Replicating the behavioral findings from Experiment 1, results indicate that the group’s punitive preferences significantly affect endorsement of punishment (Table 3; Fig. 3e), shifting individuals’ punitive choices by as much as 40 percentage points. We then used the HDDM software package to perform hierarchical Bayesian estimation of DDM parameters39. The drift rate (v), threshold (a), and bias (z) parameters were estimated at the group level using a hierarchical Bayesian procedure. This allows individuals’ contributions to the group parameters to be weighted according to their diagnostic value, maximizes power by capitalizing on statistical similarities between subjects, and addresses collinearity between variables by incorporating greater uncertainty in the posteriors of parameter estimates. To additionally account for within-subject variability, the DDM parameters of interest were regressed on the proportion of punishers. Three regressions were performed, one for each bin of offer unfairness. Decisions to punish were mapped to the upper boundary, and decisions to compensate were mapped to the lower boundary (Fig. 3b). Separate parameters were fit for each bin of offer unfairness. Drift rate and threshold were allowed to vary by the proportion of punishers in the group, whereas a single bias parameter was estimated for each type of unfairness. This model specification reflects that people likely have preexisting biases about how much punishment is warranted depending on the degree of fairness violations, while making no claims that preexisting bias varies according to group dynamics. Alternatively-specified models are tested and discussed in the Supplementary Information.

Table 3 Experiment 3: Punishing as Victim.

Statistical significance in the context of Bayesian estimation is determined by the proportion of values in a posterior distribution that fall above or below a point value, such as zero (i.e., testing whether a regression coefficient is different from zero), or the mean of another posterior distribution. Bayesian hypothesis testing is therefore conceptually akin to performing a frequentist t-test40. We define significance as 95% of posterior values falling above (or, depending on the analysis, below) the specified point value. To avoid confusion, we report 1–p, as this statistic more closely resembles p-values from frequentist significance testing.
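
This significance criterion amounts to counting posterior samples on one side of a point value. The sketch below is a minimal illustration with hypothetical names; it takes a plain list of posterior samples and does not depend on any particular estimation package.

```python
def posterior_test(samples, point, direction="below", level=0.95):
    """Compare a posterior distribution against a point value.

    samples   -- posterior samples of a parameter (e.g., bias z)
    point     -- value to test against (e.g., 0.5 for an unbiased start)
    direction -- "below" or "above": the side predicted by the hypothesis
    Returns (posterior_p, significant), where posterior_p is 1 minus the
    proportion of samples on the predicted side.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if direction == "below":
        supporting = sum(1 for s in samples if s < point)
    elif direction == "above":
        supporting = sum(1 for s in samples if s > point)
    else:
        raise ValueError("direction must be 'below' or 'above'")
    proportion = supporting / len(samples)
    # Significant when at least `level` of the posterior lies on that side.
    return 1.0 - proportion, proportion >= level
```

A posterior in which every sample of z falls below 0.5 therefore yields a reported value of 0, whereas one with only 90% of samples below 0.5 yields roughly 0.10 and does not meet the 95% criterion.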

First, given past findings from the Justice Game that Victims typically prefer to compensate in the wake of a fairness violation33, as well as our behavioral data from Experiments 1–2, we predicted that Victims would exhibit an overall bias in favor of choosing the compensatory option over the punitive one. This would be reflected by the bias (z) parameter being closer to the decision boundary for compensating (see Fig. 3b for a schematic). Values greater than 0.5 indicate an initial preference for punishment, and values less than 0.5 indicate an initial preference for compensation. Indeed, we found that Victims exhibit a preference for compensation over punishment, reflected by 95% of posterior z estimates falling under the point value 0.5 (average z for Mildly Unfair = 0.42, Somewhat Unfair = 0.43, Highly Unfair = 0.44, all posterior Ps < 0.001). This indicates that highly-punitive groups were able to influence Victims’ behaviors despite individuals having a predisposition not to punish. This preference for compensation did not significantly vary as a function of offer unfairness (all posterior Ps > 0.10).

Second, we predicted that individuals would be less cautious within groups where a majority of group members endorse punishment or compensation, as the individual would only be one vote of many. This would be reflected in a decrease in the distance a between decision thresholds (Fig. 3d). Consistent with our hypothesis, results reveal that Victims have a lowered average decision threshold when making decisions within a group majority, relative to a group that is evenly split between punishers and compensators (all posterior Ps < 0.001; Fig. 4). We additionally find that individuals are generally less cautious and more impulsive when choosing within a group majority, relative to when they are choosing alone (Fig. 4). Average a did not differ as a function of offer fairness (all posterior Ps > 0.40).

Figure 4 Mean threshold parameter estimates for Victim and Juror DDM experiments. Significance asterisks above/below each bar represent differences from the Alone condition (which was used as the reference category and is represented by a grey dashed line in each panel), and lines indicate pairwise differences. Threshold is estimated to be higher for decision contexts in which subjects determine the final outcome (i.e., choosing alone and with evenly-split groups).

Third, because greater group endorsement of the punitive option provides individuals with stronger evidence favoring punishment, we predicted that drift rate (v) would increase as the proportion of punishers in the group increased. The magnitude of v indicates the strength of evidence favoring each option, while the sign of v indicates whether the evidence favors punishment (positive) or compensation (negative; see Fig. 3b,d for a schematic). Consistent with subjects’ average preference for compensation, drift rates are largely negative. However, v increases with the proportion of punishers, demonstrating that Victims used the proportion of punishers as accumulating evidence of punishment’s value (Fig. 5). Average v did not differ as a function of offer fairness (all posterior Ps > 0.05).

Figure 5 Mean drift rate parameter estimates for Victim and Juror DDM experiments. Significance asterisks above/below each bar represent differences from the Alone condition (which was used as the reference category and is represented by a grey dashed line in each panel), and significance bars indicate pairwise differences. Drift rates grow increasingly positive with the proportion of punishers, indicating that people accumulate evidence favoring punishment from groups.

Experiment 4: Drift diffusion model of conformity as a Juror

Experiment 3 replicates our finding that Victims’ punishment decisions are susceptible to the preferences of a group. In addition, these findings demonstrate that groups act upon the decision-making process by making individuals less cautious about their choices, and by increasing how much individuals value punishment as a method of restoring justice. However, a potential alternative interpretation of these results is that individuals are acting upon an existing desire to punish and are thus not conforming to a group’s punitive preferences. That is, Victims may be reluctant to punish norm violators because they believe that retribution is perceived as being socially undesirable; observing others being punitive would then enable them to act upon their latent desire to punish. To distinguish between these alternative interpretations—and to examine whether people also conform to a group’s punitive preferences when tasked with making third-party punishment decisions19,20,30,31—we conducted a fourth experiment that was identical to Experiment 3 with the major exception that subjects made punishment decisions as Jurors on behalf of victims.

Behavioral results indicate that the group’s punitive preferences significantly modulate punishment (Table 4), shifting predicted punishment by nearly 30 percentage points. A formal comparison between Experiment 3 (Victim) and Experiment 4 (Juror) additionally finds that the conformity effect is significantly weakened for Jurors relative to Victims (β = 0.50, SE = 0.13, z = −2.58, p = 0.010; estimate and SE are in units of odds ratios).

Table 4 Experiment 4: Punishing as Juror.

In contrast to Victims (who were consistently biased in favor of compensation), DDM results reveal that Jurors were only biased in favor of compensation when fairness violations were Mildly Unfair (average z = 0.45, posterior P < 0.001). When offers were instead Somewhat or Highly Unfair, z was not statistically different from the neutral starting point 0.5 (average z for Somewhat Unfair = 0.48, Highly Unfair = 0.48, both posterior Ps > 0.05). This reveals that before observing the group’s preferences, Jurors were more neutral than Victims (i.e., not biased towards either compensation or punishment).

Replicating the pattern observed for Victims, Jurors exhibited a lower decision threshold when choosing in a group majority than when choosing in an evenly-split group (all posterior Ps < 0.001), and generally when choosing in a group majority relative to choosing alone (with a noticeable exception for when a majority of the group chose to punish highly unfair offers; Fig. 4). Although the pattern of threshold estimates between Victims and Jurors appears to be qualitatively different upon visual inspection (Fig. 4), direct statistical comparisons of the a parameter reveal no significant difference between Victims’ and Jurors’ decision criteria when comparing average a for each bin of offer unfairness (all posterior Ps > 0.20). Average a did not differ as a function of offer fairness for Jurors (all posterior Ps > 0.30), as was the case for Victims.

While Jurors’ drift rates were found to increase as the proportion of punishers in the group increased, echoing the pattern found when Victims decided the outcome (Fig. 5), results also reveal interesting differences between Victims and Jurors in how strongly groups provide compelling evidence for punishment. Whereas Victims’ average v did not differ depending on offer unfairness, Jurors’ average v significantly increased as offers became increasingly unfair (both pairwise posterior Ps < 0.05). To further probe this relationship, we fit a variant model comparing Victims and Jurors, where the number of punishers within the group was treated as a continuous variable (see Supplementary Results). Results from this model indicate that the intercept of v does not significantly differ depending on offer unfairness for Victims (all posterior Ps > 0.1), but significantly increases as offers become more unfair for Jurors (all posterior Ps < 0.05; Fig. 6a). In other words, the degree of unfairness does not influence Victims’ valuation of punishment, whereas Jurors incorporate this information into their decisions, placing higher value on punishment as unfairness increases. Additionally, the regression betas for the number of punishers are significantly greater for Victims than Jurors (all posterior Ps < 0.001; Fig. 6b), meaning that each additional punisher in a group provides stronger evidence that punishment is valuable when one is an affected Victim than when one is an impartial Juror. These two patterns are also reflected in the behavioral data (Fig. 3e).

Figure 6 Effects of fairness violation severity on Victims’ and Jurors’ decisions to punish. (a) As indicated by the drift rate intercept, Victims do not distinguish between differing levels of offer unfairness, whereas Jurors do. (b) As shown by the drift rate betas, each additional punisher in the group provides stronger evidence that punishment should be favored, more so for Victims than Jurors.

Experiment 5: Crime judgments

Taken together, Experiments 1–4 demonstrate robust group influence on an individual’s punitive behavior, regardless of whether one is deciding as a Victim or Juror. However, our use of an economic game paradigm to precisely measure behavior leaves open an important question: is a group’s influence on an individual’s desire to punish limited to fairness violations, or does social influence act more generally upon a variety of moral violations? In order to answer this question, we performed a fifth experiment to examine whether a group can shift an individual’s judgments of how severely perpetrators should be punished for committing different types of crimes.

Subjects read short vignettes describing either a physical assault or a theft. The crime type (assault vs theft) was crossed with two levels of crime severity: high-intensity crimes involved the use of weapons, whereas low-intensity crimes did not. All vignettes featured an unambiguous perpetrator and victim. For subjects who completed the Victim condition, vignettes were written using second-person pronouns, and subjects were asked to imagine that they were the victim of the crime. The exact same vignettes were presented in the Juror condition, but rather than using second-person pronouns, they described the victims as unrelated strangers.

After reading the vignette, subjects were asked to rate how severely the perpetrator should be punished on a 100-point scale, where 0 corresponded to “Mild Punishment” and 100 corresponded to “Severe Punishment” (Fig. 7a). On Alone trials, subjects made their judgment in the absence of any social information. On Group trials, subjects were shown four icons that represented the responses of four past participants (though in reality, all responses were experimenter-generated to fully parameterize the decision space). The icons indicated the proportion of the group that had previously endorsed severe punishment as opposed to mild punishment.

Figure 7 Change in punishment severity judgments for Victims and Jurors. (a) Schematic illustrating sample trials in which subjects are choosing alone or within a group. Vignettes were written in the second-party perspective in the Victim condition, and in the third-party perspective in the Juror condition. (b) Overall, the proportion of group members endorsing severe punishment increases subjects’ judgments for how severely a perpetrator should be punished for committing a crime. The midpoint of the scale is indicated using a grey dashed line. Error bars reflect ± 1 SEM.

To examine whether a group’s endorsement of punishment alters an individual’s judgments about the severity of punishment warranted by real-world crimes, we performed a linear mixed-effects regression, modeling judgments as an interaction between the proportion of punishers in the group (centered around 50% punishers), the type of crime (assault vs theft), and crime intensity (centered around medium intensity), with a covariate for subjects’ baseline preference for punishment (mean-centered and standardized). The overall pattern of results reveals that the proportion of group members endorsing severe punishment significantly modulates subjects’ judgments of how severely crimes should be punished (Fig. 7b), both when they are Victims (Table 5) and when they are Jurors (Table 6). A formal comparison between Victims and Jurors further reveals that the overall conformity effect is significantly stronger for Jurors than for Victims (β = 2.06, SE = 0.69, z = 3.00, df = 224.00, p = 0.003), unlike our findings from Experiments 3–4. This divergence may be due to differences in moral decision-making when choices are hypothetical41,42, or may alternatively be due to the fact that the vignettes used in Experiment 5 involve moral transgressions that are more severe than the fairness violations in Experiments 1–4.

Table 5 Experiment 5: Punitive Judgments as a Victim.