What’s the science?

We use feedback from rewards every day to learn new things. For example, if we are offered a mango and we have enjoyed several mangos previously, over time we might learn to favor mangos. Research on the neural underpinnings of this form of reward-based learning typically focuses on short-term learning (across several minutes). However, we don’t know what happens when learning occurs over several weeks, as it might in many everyday situations. Gradual learning from reward feedback relies on a dopaminergic system in the brain, but the short-term learning paradigms used in typical experiments may also rely on short-term (‘working’) memory systems. This week in the Journal of Neuroscience, Wimmer and colleagues used behavioural testing and functional magnetic resonance imaging (fMRI) in humans to understand the mechanisms underlying reward learning over a period of several weeks versus within a single session.

How did they do it?

The authors completed two similar studies for replication purposes. In the first study, 33 participants completed a behavioural and fMRI experiment, while in the second study, 31 participants completed a behaviour-only experiment. In both studies, the participants' task was to learn whether the best response to each stimulus (a scene presented on a screen) was ‘Yes’ or ‘No’ (the wording was arbitrary). The experimenters had randomly assigned each stimulus to be either reward-associated or loss-associated. For reward-associated stimuli, responding ‘Yes’ won the participant $0.35 (on average) and responding ‘No’ lost $0.05. For loss-associated stimuli, responding ‘Yes’ lost $0.25 and responding ‘No’ gained $0.00. Feedback was given after each trial and was probabilistic: there was an 80% chance that the best response would result in the best outcome/payment. In an initial learning session in the lab, participants learned about 8 ‘spaced’ stimuli. Next, three learning sessions for the ‘spaced’ stimuli were completed online (over ~two weeks). In a lab session about two weeks later, participants learned about 8 new ‘massed’ stimuli, each presented the same number of times as the previously seen ‘spaced’ stimuli.
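
To make the payoff scheme concrete, here is a minimal Python sketch of a single trial's feedback. It is an illustration only, not the authors' task code. The dollar amounts and the 80% figure come from the description above. Everything else is an assumption made for this example: the function names, the use of fixed amounts rather than amounts that vary around an average, and in particular the rule for the remaining 20% of trials, which the summary does not specify and which is modelled here by borrowing the payoff from the opposite stimulus type.

```python
import random

# Nominal payoffs in dollars, taken from the task description.
# Fixed amounts are a simplification: the $0.35 win is described as an average.
PAYOFFS = {
    ("reward", "Yes"): 0.35,
    ("reward", "No"): -0.05,
    ("loss", "Yes"): -0.25,
    ("loss", "No"): 0.00,
}
P_VALID = 0.8  # chance that feedback follows the stimulus's own payoffs


def opposite(stimulus_type):
    """Return the other stimulus type."""
    return "loss" if stimulus_type == "reward" else "reward"


def best_response(stimulus_type):
    """Return the response with the higher nominal payoff."""
    return max(("Yes", "No"), key=lambda r: PAYOFFS[(stimulus_type, r)])


def feedback(stimulus_type, response, rng):
    """Draw the payoff for one trial.

    ASSUMPTION: on the remaining 20% of trials the payoff is taken from
    the opposite stimulus type. The source does not say what happens on
    those trials; this is just one simple way to model it.
    """
    if rng.random() < P_VALID:
        effective = stimulus_type
    else:
        effective = opposite(stimulus_type)
    return PAYOFFS[(effective, response)]


def expected_payoff(stimulus_type, response):
    """Long-run average payoff under the assumption described above."""
    valid = PAYOFFS[(stimulus_type, response)]
    invalid = PAYOFFS[(opposite(stimulus_type), response)]
    return P_VALID * valid + (1 - P_VALID) * invalid


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same draws
    for stim in ("reward", "loss"):
        for resp in ("Yes", "No"):
            draws = [feedback(stim, resp, rng) for _ in range(5)]
            print(stim, resp, round(expected_payoff(stim, resp), 2), draws)
```

Under these assumptions, ‘Yes’ remains the better response for reward-associated stimuli and ‘No’ for loss-associated stimuli, even though any single trial can be misleading. That is why participants need repeated exposures to learn each stimulus.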