Apparently, the act of free choice confers value: when selecting between an item that you had previously chosen and an identical item that you had been forced to take, the former is often preferred. What could be the neural underpinnings of this free-choice bias in decision making? An elegant study recently published in Neuron suggests that enhanced reward learning in the basal ganglia may be the culprit.

Would you prefer the chicken or the pasta entrée? And what would you think of an airline that offers only one of these entrées? Obviously, all else being equal, you would prefer the airline that gave you a choice. After all, that can result in more enjoyment of your in-flight meal. But what if the airline had studied your preferences and offered you the single entrée you would have chosen anyway, had you flown with the competitor? Would you have a preference between the two airlines? And in what situation would the pasta be tastier (if this adjective is ever applicable to what airlines call pasta)?

In an article recently published in Neuron, Cockburn et al. (2014) [1] report the results of an experiment that created just such a scenario: participants learned the value of each of six different options through repeated choices between pairs of options. Intermingled with these were trials in which participants were shown a pair of options drawn from a separate, visually distinct set of six stimuli, with one of the options marked as the one that must be chosen. Importantly, Cockburn and colleagues yoked the trials such that each choice from the first set of options was exactly replicated in the second set. Thus, by any simple learning model, the learned values of each ‘free choice’ option and its corresponding ‘forced’ option would be identical. However, when given a choice between the two in a subsequent test phase, participants showed a clear preference for the option they had freely chosen. This pattern was found for all three positive-value stimuli, but not for stimuli whose choice led to losses more often than to gains.
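To make the "simple learning model" prediction concrete, here is a minimal sketch in which a free-choice option and its yoked forced option are both updated by an ordinary delta rule. The function names, the learning rate, the 80% reward probability used as a default, and the assumption that yoking replays the same outcome sequence are all illustrative choices, not details taken from the study.

```python
import random


def delta_update(value, reward, alpha=0.1):
    """Delta rule: move the value estimate a fraction alpha toward the reward."""
    return value + alpha * (reward - value)


def simulate_yoked(n_trials=200, p_reward=0.8, alpha=0.1, seed=0):
    """Learn values for a free option and its yoked forced counterpart.

    Assumption: the forced option experiences exactly the same sequence
    of outcomes as the free option, and both use the same learning rule.
    """
    rng = random.Random(seed)
    v_free, v_forced = 0.0, 0.0
    for _ in range(n_trials):
        reward = 1.0 if rng.random() < p_reward else 0.0
        v_free = delta_update(v_free, reward, alpha)
        v_forced = delta_update(v_forced, reward, alpha)
    return v_free, v_forced


if __name__ == "__main__":
    v_free, v_forced = simulate_yoked()
    print(f"free: {v_free:.3f}  forced: {v_forced:.3f}")
```

Under these assumptions the two learned values are identical by construction, which is why a preference for the freely chosen option in the test phase requires some additional mechanism.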

A preference for freely chosen options had been demonstrated before [2–5], but not as elegantly. For instance, it is not sufficient to show that when choosing between two equally valued options, the chosen option is subsequently evaluated as better than its competitor. It is conceivable that choices spur us to resolve value more precisely. In that case, the subsequent preference for the previously selected option may reflect not a free-choice bias but the fact that this option was inherently more valuable, with its exact value resolved only at the time of choice. Cockburn et al.’s experimental design successfully avoided this and several other possible confounds.

Other than a perplexing case of cognitive dissonance [6], what could be driving the preference for freely chosen options? Cockburn et al. used a computational model of reinforcement learning couched in basal ganglia anatomy and physiology [7] to suggest that free choice enhances learning from rewards. The idea is that when an option is freely chosen, unexpected rewards cause stronger long-term potentiation (LTP) of ‘go’ (direct) pathway striatal neurons and more long-term depression (LTD) of ‘no-go’ (indirect) pathway neurons than a similarly unexpected reward obtained for a forced choice, thus promoting a stronger propensity to choose the same option in the future.
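The proposed asymmetry can be illustrated with a deliberately simplified actor that keeps one ‘go’ and one ‘no-go’ weight per option. This is a sketch under stated assumptions and not the published model: the additive update, the clipping of weights at zero, the multiplicative `free_gain` applied only to positive prediction errors on free-choice trials, and all parameter values are hypothetical.

```python
import random


def update_actor(go, nogo, delta, free_choice, alpha=0.1, free_gain=2.0):
    """Update the 'go' and 'no-go' weights of the chosen option.

    A positive prediction error (delta > 0) strengthens 'go' (LTP-like)
    and weakens 'no-go' (LTD-like); a negative one does the opposite.
    Assumption: free choice scales only positive-delta updates.
    """
    gain = free_gain if (free_choice and delta > 0) else 1.0
    step = gain * alpha * delta
    go = max(0.0, go + step)      # weights are kept non-negative
    nogo = max(0.0, nogo - step)
    return go, nogo


def run_yoked_actor(n_trials=100, p_reward=0.8, alpha=0.1, free_gain=2.0, seed=1):
    """Return (free, forced) choice propensities after yoked training."""
    rng = random.Random(seed)
    expected = 0.5                # shared reward expectation under yoking
    free = (1.0, 1.0)             # (go, no-go) for the freely chosen option
    forced = (1.0, 1.0)           # (go, no-go) for the yoked forced option
    for _ in range(n_trials):
        reward = 1.0 if rng.random() < p_reward else 0.0
        delta = reward - expected
        free = update_actor(*free, delta, True, alpha, free_gain)
        forced = update_actor(*forced, delta, False, alpha, free_gain)
        expected += alpha * delta
    # Propensity to choose an option: 'go' weight minus 'no-go' weight.
    return free[0] - free[1], forced[0] - forced[1]


if __name__ == "__main__":
    free_pref, forced_pref = run_yoked_actor()
    print(f"free: {free_pref:.2f}  forced: {forced_pref:.2f}")
```

With `free_gain` above 1 the freely chosen option ends training with the larger propensity even though both options saw the same outcomes; setting `free_gain=1.0` removes the difference.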

Cockburn et al. further suggested that the enhancement is due to amplification of positive reward prediction error signals conveyed by dopamine neurons in the substantia nigra pars compacta, which modulate corticostriatal learning in a three-factor learning rule. However, the implementation of this mechanism in the model – multiplying the size of the update by some factor in free-choice trials that led to a positive prediction error – is also consistent with other alternatives. Specifically, the three-factor learning might be enhanced due to stronger presynaptic cortical activity (e.g., due to enhanced attention to, and thus representation of, the options in the free-choice trials) or stronger post-synaptic striatal activity (e.g., due to random fluctuations of the value of different options [8] – valuation noise is likely to be positive conditional on the option being freely chosen, whereas in the identical forced-choice trial it is, on average, zero). The latter explanation would predict a stronger free-choice bias when choosing between options that initially had a low value; however, the empirical results showed that the bias was pronounced only for high-valued options. By contrast, the former explanation accords with the intuition that participants pay more attention (to stimuli, to outcomes) on trials in which they have to earn rewards by making correct choices than on trials in which all they do is follow instructions. Therefore, it is not necessarily prediction errors that are enhanced in rewarded, free-choice trials. It could also be activity in striatal neurons or their cortical afferents.
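The selection effect behind the valuation-noise account can be checked with a short simulation. The sketch assumes two options of equal true value, independent zero-mean Gaussian noise on each momentary valuation, and a free choice that simply picks the option with the larger noisy value; the function name and parameters are hypothetical.

```python
import random
from statistics import mean


def chosen_noise(n_trials=20000, noise_sd=1.0, seed=0):
    """Average valuation noise of the selected option, free vs. forced.

    Assumptions: both options have the same true value, and noise is
    independent Gaussian with mean zero on every trial.
    """
    rng = random.Random(seed)
    free, forced = [], []
    for _ in range(n_trials):
        noise_a = rng.gauss(0.0, noise_sd)
        noise_b = rng.gauss(0.0, noise_sd)
        # Free choice: the option with the larger momentary value wins,
        # so the chosen option's noise is the maximum of the two draws.
        free.append(max(noise_a, noise_b))
        # Forced choice: the instructed option (here, A) is fixed in
        # advance, so its noise is an unselected draw.
        forced.append(noise_a)
    return mean(free), mean(forced)


if __name__ == "__main__":
    free_mean, forced_mean = chosen_noise()
    print(f"free: {free_mean:.3f}  forced: {forced_mean:.3f}")
```

Conditioning on being chosen shifts the average noise above zero, whereas the instructed option's noise averages out to roughly zero, which is the asymmetry the post-synaptic explanation relies on.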

In addition to the behavioral and modeling results, Cockburn et al. also genotyped their participants, focusing on three dopamine-related genes [9]. They found a significant interaction between the pattern of preferences for freely chosen options and DARPP-32 genotype: participants with one variant of this gene preferred the most highly rewarding (80%) freely chosen options most strongly, with the preference diminishing as reward probability decreased. By contrast, those without this genotype showed the strongest preference for the freely chosen option that was associated with an intermediate reward probability (60%), with this preference diminishing as reward probability increased. Model simulations showed that this interaction could be the result of a different balance between learning in the ‘go’ as compared to the ‘no-go’ pathway in the two populations. These results, therefore, tie preferences for freely chosen options to learning in the basal ganglia. However, given that the genetic variation was likely not due to group differences in the free-choice enhancement of learning, the genetic results could not further illuminate the precise implementation of the free-choice bias.

Whether the free-choice learning bias results from dopaminergic or other mechanisms remains to be resolved. In the meantime, next time you want a child (or a student) to be happy with their toy (or project), try to set things up so that they can choose it ‘freely,’ even if in reality there is only one viable option. We’ve seen it done, and it works.

Acknowledgments

The authors’ work was supported by the Human Frontiers Science Program Organization (Y.N. and A.L.) and award number R01MH098861 from the National Institute for Mental Health (Y.N. and A.R.). We are grateful to Jeffrey Cockburn, Anne GE Collins, and Michael J Frank for illuminating discussions.

References

1 Cockburn J. et al. A reinforcement learning mechanism responsible for the valuation of free choice. Neuron (2014).
2 Sharot T. et al. How choice reveals and shapes expected hedonic outcome.
3 Sharot T. et al. Do decisions shape preference? Evidence from blind choice.
4 Leotti L.A. and Delgado M.R. The inherent reward of choice.
5 Leotti L.A. and Delgado M.R. The value of exercising control over monetary gains and losses.
6 Festinger L. A Theory of Cognitive Dissonance.
7 Collins A.G.E. and Frank M.J. Opponent actor learning (OpAL): modeling interactive effects of striatal dopamine on reinforcement learning and choice incentive.
8 Krajbich I. et al. Visual fixations and the computation and comparison of value in simple choice.
9 Doll B.B. et al. Dopaminergic genes predict individual differences in susceptibility to confirmation bias.