Significance A central tenet of economics is that as the price of a commodity increases, demand for it falls because individuals choose to buy less. Mounting evidence supports a role for the neuromodulator dopamine in representing subjective value. We investigated the role of dopamine in valuation by presenting rats with a reward across a range of prices. We showed that dopamine concentration decreased with price and that increasing release using optogenetic manipulations altered price sensitivity. Increasing release prior to reward delivery made animals more sensitive to price, whereas increasing release at reward delivery made animals less sensitive to price. These data extend the notion that dopamine release events encode subjective value and further demonstrate that increasing dopamine release causally modifies price sensitivity.

Abstract The mesolimbic dopamine system is strongly implicated in motivational processes. Currently accepted theories suggest that transient mesolimbic dopamine release events energize reward seeking and encode reward value. During the pursuit of reward, critical associations are formed between the reward and cues that predict its availability. Conditioned by these experiences, dopamine neurons begin to fire upon the earliest presentation of a cue, and again at the receipt of reward. The resulting dopamine concentration scales proportionally to the value of the reward. In this study, we used a behavioral economics approach to quantify how transient dopamine release events scale with price and causally alter price sensitivity. We presented sucrose to rats across a range of prices and modeled the resulting demand curves to estimate price sensitivity. Using fast-scan cyclic voltammetry, we determined that the concentration of accumbal dopamine time-locked to cue presentation decreased with price. These data confirm and extend the notion that dopamine release events originating in the ventral tegmental area encode subjective value. Using optogenetics to augment dopamine concentration, we found that enhancing dopamine release at cue made demand more sensitive to price and decreased dopamine concentration at reward delivery. From these observations, we infer that value is decreased because of a negative reward prediction error (i.e., the animal receives less than expected). Conversely, enhancing dopamine at reward made demand less sensitive to price. We attribute this finding to a positive reward prediction error, whereby the animal perceives they received a better value than anticipated.

Our understanding of the role dopamine plays in the motivation to act has evolved over several decades. It was first demonstrated that neurotoxic lesions of mesolimbic dopamine fibers (1) and pharmacological antagonism of dopamine receptors (2) impair reward-seeking actions without disrupting general motor activity (3). Then, in vivo electrophysiological recordings demonstrated that bursts of dopamine neural activity occur when animals are presented with an unexpected reward or a reward-predictive stimulus, but are suppressed when the reward is withheld (4). This observation led to the development of the reward prediction error theory, which suggests that a transient dopamine signal encodes the reward prediction signaled by the cue. Electrochemical studies generally confirmed this theory by demonstrating that transient accumbal dopamine release events occur when animals are presented with rewards and their conditioned predictors (i.e., cues), but are suppressed during reward omission (5, 6). More recently, optogenetic manipulations, which can be used to assess the causal relationship between patterns of neural activity and behavior, were used to demonstrate that augmenting dopamine release at reward delivery accelerates reward learning (7).

At present, these observations are being reconsidered within the context of economic theory. It has been proposed that the transient dopamine signal represents subjective value (8). This revision is supported by electrophysiological and electrochemical studies demonstrating that increases in reward magnitude augment both the phasic activation of midbrain dopamine neurons and transient accumbal dopamine concentrations (6, 8). Similar results have been reported when price is modified by manipulating effort (6), but they remain controversial (9). Furthermore, a subpopulation of accumbal neurons that receives dopaminergic input has been shown to represent effort-based costs (10). Recent optogenetics experiments have also confirmed a causal role for dopamine in valuation by demonstrating that transient dopamine manipulations alter the willingness to work for a reward (11).

In this study, we use a behavioral economics approach to investigate the relationship between dopamine and price. Behavioral economics has historically been used in the fields of behavioral analysis (12) and psychopharmacology (13, 14), and more recently to study the relationship between dopamine and valuation (8, 15–18). An elegant body of electrophysiology studies demonstrated that dopamine neurons respond to gambles and outcomes to guide economic decision making (19), and are able to integrate various factors that underlie value representations to influence economic choices (20). As various factors can contribute to subjective value, including risk (20), satiety (21, 22), and delay (23, 24), we started by focusing on the most obvious and easiest to measure: the unit price of a commodity, generally defined as the response requirement per unit reward (13). In addition to characterizing the relationship between price and dopamine concentration, our study builds upon the existing literature by using demand curves to assess how optical manipulation of dopamine neurons causally influences price sensitivity.

Price sensitivity can be experimentally determined using demand curves that plot the relationship between price and consumption (14). Typically, consumption is inversely related to price, resulting in demand curves with a negative gradient. The rate at which the demand curve decays is a measure of price sensitivity or, in economic terms, the elasticity of demand (14). If an animal’s demand for a commodity becomes more sensitive to price, the demand curve would decay at a faster rate. From this, we would conclude that the value the animal places on the commodity is diminished. If the revised value-based theory of reward prediction error is correct, then the transient dopamine response should be sensitive to price and modulating dopamine release should alter the rate at which demand curves decay.
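As a minimal sketch of how a decay rate might be estimated from price–consumption pairs, the Python below fits an exponential demand curve of the form Q = Q_min + (Q_max − Q_min)·e^(−αC), where C is unit price and α is the decay rate (price sensitivity). The prices, consumption values, and function names are hypothetical and chosen only for illustration. The search strategy (a coarse grid over α followed by golden-section refinement, with Q_min and Q_max solved by ordinary least squares at each α) is a simple stand-in and is not the minpack.lm routine described in Methods.

```python
import math

def demand(price, q_max, q_min, alpha):
    """Consumption decays exponentially from q_max toward q_min as price rises."""
    return q_min + (q_max - q_min) * math.exp(-alpha * price)

def solve_linear_part(prices, consumption, alpha):
    """With alpha fixed, the model is linear in q_min and (q_max - q_min),
    so ordinary least squares gives both in closed form."""
    x = [math.exp(-alpha * c) for c in prices]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(consumption) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, consumption))
    span = sxy / sxx                  # estimate of q_max - q_min
    q_min = mean_y - span * mean_x
    q_max = q_min + span
    sse = sum((yi - demand(c, q_max, q_min, alpha)) ** 2
              for c, yi in zip(prices, consumption))
    return q_max, q_min, sse

def fit_demand(prices, consumption, lo=0.01, hi=5.0, n_grid=200, iters=60):
    """Coarse grid over alpha, then golden-section refinement near the best grid point."""
    def sse(alpha):
        return solve_linear_part(prices, consumption, alpha)[2]
    grid = [lo + (hi - lo) * k / (n_grid - 1) for k in range(n_grid)]
    best = min(range(n_grid), key=lambda k: sse(grid[k]))
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, n_grid - 1)]
    g = (math.sqrt(5) - 1) / 2        # inverse golden ratio
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    alpha = (a + b) / 2
    q_max, q_min, _ = solve_linear_part(prices, consumption, alpha)
    return q_max, q_min, alpha

# Hypothetical, noise-free demand profiles for two conditions (arbitrary price units).
prices = [0.1, 0.2, 0.5, 1.0, 2.0, 4.0, 8.0]
baseline = [demand(c, 900.0, 0.0, 0.5) for c in prices]
steeper = [demand(c, 900.0, 0.0, 1.0) for c in prices]

_, _, alpha_baseline = fit_demand(prices, baseline)
_, _, alpha_steeper = fit_demand(prices, steeper)
print(alpha_baseline, alpha_steeper)
```

In this sketch, the condition with the larger fitted α is the one whose demand curve decays faster, which corresponds to greater price sensitivity in the sense described above.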

We predicted that dopamine would scale in an inversely proportional manner to unit price, irrespective of the order in which unit prices are presented. We further predicted that augmenting dopamine release at cue would make demand more sensitive to price because of a negative reward prediction (i.e., the animal receives a worse value than expected). Conversely, we predicted that augmenting dopamine release at reward delivery would sustain demand at higher prices because of a positive reward prediction (i.e., the animal receives a better value than expected).

Discussion In the present study, we examined the role of dopamine in value assessments using a combination of behavioral economic theory, in vivo electrochemistry, optogenetics, and modeling. Two tasks were used: a cost-manipulation task and a reward-manipulation task. We found that dopamine concentration decreased as price increased in both behavioral economics-based tasks (Fig. 1). A divisive ratio unit price model (response requirement/milligrams of sucrose) was used as opposed to a subtractive model (response requirement – milligrams of sucrose) because the ratio price model provides the definition of unit price (cost/unit of good). The inverse relationship between dopamine concentration and price was observed regardless of the order of sequential price presentation. These data confirm and extend the notion that dopamine release events originating in the VTA encode subjective value (8). Our optogenetics data revealed that augmenting release at cue presentation increased sensitivity to price and decreased dopamine release at reward delivery. In contrast, augmenting release at reward delivery reduced sensitivity to price. Therefore, we further conclude that transient dopamine release events originating in the VTA causally modify valuation. The mesocorticolimbic system projects to various terminal fields including the orbitofrontal cortex, dorsal striatum, NAcc, and olfactory tubercle, all of which have been implicated in either valuation or motivation (30–35). Because we observed that accumbal dopamine concentration decreases with price, we also investigated the effects of bilaterally stimulating terminal release in the core region of the NAcc. We observed trends that were consistent with VTA stimulation, but the accumbal stimulation effects were weaker in the reward-manipulation task, confirming a recent study by Saddoris et al. (36). These data suggest that the NAcc is involved in valuation, but likely not exclusively. 
The current study builds upon a growing body of literature that is reconsidering the influential theory of reward prediction error in the context of economic utility. Reward prediction error theory suggests that a transient dopamine signal encodes the discrepancy between a reward and its predictive cue (4, 8). A series of choice studies demonstrate that transient dopamine release events encode value as positive or negative reward prediction errors (6, 9, 37). The data presented herein generally support the basis of reward prediction error and its role in economic utility. Dopamine concentration at cue presentation and reward delivery decreased as price increased. Furthermore, increasing dopamine release at cue presentation rendered animals more sensitive to price and decreased dopamine concentration at reward delivery, consistent with a negative reward prediction error. In this case, we infer that the subjective value of sucrose is decreased because the animal perceives that it received less than expected. Conversely, augmenting release at reward presentation rendered animals less sensitive to price, consistent with a positive reward prediction error. In this case, we infer that the subjective value of sucrose is increased because the animal perceives that it is a good bargain to receive more than expected. However, it should be noted that dopamine concentration at cue and reward never reached zero when action was maintained in the task. This observation may suggest that the positive and negative prediction errors encountered in this task center around subjective value rather than a zero baseline. These results are also relevant in the context of surprisal error (38), a central component in hierarchical models of brain function and the free-energy principle. According to this principle, intelligent agents act to minimize surprise (39–41). 
In this context, the dopamine prediction error signal is a surprise signal, and the content of the surprise is subjective value. Thus, the dopamine signal reflects a prediction error in subjective value or, more precisely, a utility prediction error. We also found that increasing dopamine release at either cue presentation or reward delivery decreased response latency, consistent with its role in invigoration (5, 42). Regardless of how valuation changed, optically amplifying transient dopamine release events reduced response latencies. Taken together with our demand curve analyses, we conclude that this invigorating role is dissociable from the role of dopamine in valuation. Another important advancement in this study is our use of mathematical modeling of demand curves coupled with a comprehensive statistical analysis to quantify the causal effect of dopamine in valuation. While an elegant body of electrophysiological studies has used behavioral economic theory to assess the role of dopamine neural firing in valuation (19, 20), the current study builds upon this work by providing a formal demand analysis during both electrochemical and optogenetic assessments of dopamine release. Demand curves are a common tool used by economists to measure price sensitivity. Assessing the role of dopamine in valuation using demand curve analysis is a logical progression following a recent study demonstrating that dopamine manipulations alter the valuation of work (27). As in the study by Hamid et al. (27), we found that transient dopamine release events are involved in adaptive value assessments, and that augmenting this signal alters valuation. We also confirm that optical stimulation of dopamine neurons decreases response latency for reward (27), but we reach a distinct conclusion following our demand analysis. Despite decreasing response latency, optically increasing dopamine release at cue made animals more sensitive to price. 
From these observations, we conclude that heightened release at cue both invigorates behavior and leads to a negative reward prediction error due to a mismatch between the value of reward predicted and the value of reward received. This negative mismatch contributes to a decrease in subjective value. We believe this dissociation lends credence to the notion that transient dopamine release events play multiple roles in motivated behavior rather than functioning as a single uniform motivational signal (43). Several future directions and alternative interpretations should be considered. While the present study demonstrates that optical activation of dopamine neurons modifies valuation, it remains unknown whether inhibiting dopamine neurons produces diametrically opposite effects. It is possible that decreasing release fails to alter valuation, which would suggest that, while dopamine is sufficient to modify valuation, it may not be necessary for valuation to occur. Additional investigation into how optical manipulation of dopamine neurons alters valuation using simpler behavioral designs should also be performed and compared with the results from demand analysis. The role of contingency degradation is another important consideration. Previous experiments demonstrated that providing unsignaled rewards during an experimental session decreases reward seeking through a devaluation process that requires the core region of the NAcc (44, 45). In theory, optical stimulation of dopamine neurons could function similarly to the receipt of unsignaled sucrose—either could ultimately devalue the final reward the animals are working for. This issue would be particularly concerning with long and/or more frequent patterns of dopamine neuron stimulation than those used in the present study (10 pulses at 20 Hz, 0.5-s duration). Future studies are needed to determine how dopamine interacts with other neurotransmitters within a neural network to encode and control valuation. 
Using the current approach, we can also assess how dopamine release and economic demand for reward are altered in animal models of psychiatric disease. In conclusion, our results suggest that a transient dopamine value signal encodes subjective value and is capable of modifying the worth an animal places on a desired commodity.

Methods Subjects and Surgery. Male Long–Evans rats supplied by Charles River Labs and transgenic rats [LE-Tg(TH-Cre)3.1Deis] expressing the Cre-recombinase protein under control of the tyrosine hydroxylase (TH) promoter (Th::Cre+/−) (46) supplied by Rat Resource and Research Center (300–350 g at time of surgery) were used as subjects. Surgery was conducted in a Kopf stereotaxic apparatus under isoflurane anesthesia (5% induction, 2% maintenance). Upon recovery, rats were food restricted to 90% of their free-feeding body weight. Access to water and crinkle paper enrichment were provided ad libitum in the home cage. Rats were singly housed under a 12:12 light/dark cycle with lights off at 10:00 AM. All experiments were conducted in the dark/active cycle. For FSCV surgery, rats were implanted with a microdialysis guide cannula (BAS) aimed at the NAcc core [+1.3 anteroposterior (AP), +1.4 mediolateral (ML)] and a contralateral Ag/AgCl reference electrode. For optogenetics surgery, rats received Cre-dependent virus [AAV-EF1a-DIO-hChR2(H134R)-EYFP; UNC vector core] and optical ferrule cannulae (preassembled; Thor Labs) aimed at the VTA. Viral infusion coordinates were −5.2 and −6.0 AP, ±0.5 ML, and −7.4 and −8.4 dorsoventral (DV); VTA optical ferrule cannulae coordinates were −5.6 AP, +0.5 L, and −7.9 DV; NAcc optical ferrule cannulae coordinates were +1.3 AP, ±1.4 ML, and −6.5 DV. We also prepared a WT control group. Here, WT Long–Evans rats were transfected with virus and implanted with optical ferrule cannulae similarly to transgenic rats prepared for VTA stimulation. These WT animals also received identical optical stimulation of the VTA during behavior. Behavioral training commenced >1 wk following surgery; sessions that included optical stimulation commenced >4 wk following surgery. The University of Colorado Denver Institutional Animal Care and Use Committee approved all experiments and procedures in advance. Behavioral Economics Tasks. 
Following acquisition, animals were provided daily (7 d/wk) access to sucrose in behavioral economic tasks. In these tasks, sucrose is provided to rats across 10 increasing unit prices (i.e., response requirement/milligrams of sucrose). We developed two iterations of the task to assess the role of both the numerator and denominator of the unit price ratio. As illustrated in the left table of SI Appendix, Fig. S1, in the numerator assessment, the response requirement to receive a 45-mg sucrose pellet increases every 10 min with a 30-s time-out period occurring after reinforcement. Thus, the animal may receive a maximum of 900 mg of sucrose per epoch. In the denominator assessment (right table), the response requirement remains fixed at 1, while the milliliters of sucrose solution (300 mg/mL) delivered in each epoch decreases by manipulating pump duration, as previously described and validated during cocaine self-administration (47, 48). In this task, the time-out period decreases in each epoch (30, 10, 5, 3, 1.67, 0.97, 0.54, 0.3, 0.17, and 0.10 s) so that the animal may receive a maximum of 450 mg of sucrose per epoch. Importantly, in each task, unit prices are matched across epochs. Reward availability was indicated to the animal by illumination of a cue light and lever extension. After each reinforced response, the cue light dimmed and the lever retracted for the duration of an experimenter-imposed time-out. Fitting Demand Curves. We used the R statistical environment (49) for our analysis of demand profiles. We fitted individual demand profiles to the model equation (Eq. 1) using nonlinear least-squares regression, as implemented in the “minpack.lm” R package (50). Our code and data are freely available at https://gitlab.com/oleson/schelp-pnas. Multilevel Model and Bayesian Analysis. 
For the multilevel analysis, we constructed the following multilevel version of our model:

$$(Q_{\mathrm{mod}})_i = Q_{\min} + (Q_{\max} - Q_{\min})\, e^{-\alpha[\mathrm{trt}_i]\, C_i},$$

where the index $i$ enumerates each observation (a single unit price–consumption pair) within an experimental block. The variable $\mathrm{trt}_i$ is the treatment associated with this observation. This model structure effectively associated all observations under a given treatment with a treatment-level $\alpha$. Each observation is treated as an independent realization of this model plus some normally distributed noise with an unknown variance $\sigma^2$:

$$(Q_{\mathrm{obs}})_i = (Q_{\mathrm{mod}})_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2),$$

or, equivalently:

$$(Q_{\mathrm{obs}})_i \sim N\big((Q_{\mathrm{mod}})_i, \sigma^2\big),$$

where $N(\cdot)$ represents a normal distribution. In the Bayesian inference procedure, the model parameters $\theta = \{Q_{\max}, Q_{\min}, \alpha[\mathrm{trt}], \sigma^2\}$ are treated as random variables with unknown distributions. The goal is to infer these probability distributions such that they best explain the observed data. This is achieved by proposing a prior probability distribution $P(\theta)$ and applying Bayes’ rule to compute a posterior probability distribution that is conditioned on the observed data:

$$\pi(\theta) \propto P(\theta)\, P(\mathrm{obs} \mid \theta),$$

where $P(\mathrm{obs} \mid \theta)$ is the probability of an observation given some $\theta$. This probability is usually called the likelihood function. It is high when there is a good match between an observed response $(Q_{\mathrm{obs}})_i$ and the model prediction $(Q_{\mathrm{mod}})_i$, or equivalently, when the error term $\epsilon_i$ is small in magnitude. Thus, parameter values that predict a closer match with the observations are more heavily weighted in the posterior distributions. We used an efficient Hamiltonian Monte Carlo scheme implemented in the Stan modeling language to generate samples from the posterior distributions (51). FSCV. 
On sessions involving voltammetric recordings, glass-encased carbon fiber microelectrodes were introduced into the NAcc using micromanipulators and locked in place for behavioral testing. Dopamine was detected from fast-scan cyclic voltammograms collected at the carbon fiber electrode every 100 ms (initial waveform: −0.4 to 1.3 V, 400 V/s) (52). Principal-component regression was used as previously described to extract the dopamine component from the raw voltammetric data (53). Representative dopamine concentration traces, but not data used for quantification, were smoothed using the built-in Tarheel CV smoothing option (eight-point nearest-neighbor smoothing kernel). Due to concerns regarding the validity of using standardized calibration factors for dopamine assessments (54), we applied a recently developed computational model (55) designed to calculate calibration factors for individual electrodes using background current values observed during in vivo recordings. For additional detail, see SI Appendix, Fig. S13. When relevant, microelectrode placement was determined by performing electrolytic lesions of recording sites before perfusion (SI Appendix, Fig. S16A). Additional representative histology is also depicted in SI Appendix, Fig. S16 B–F. See SI Appendix, Fig. S16 legend, for histology methods. Optogenetic Stimulation. Intracranial light (473 nm) was delivered using a laser (Opto Engine) under control of a custom Arduino system. Stimulation parameters (10 pulses at 20 Hz, 0.5-s duration) were identical for all experiments. Laser output was adjusted according to the Stanford brain tissue light transmission calculator to produce a 1-mm cone irradiance of 1 mW/mm² in brain tissue, with 15-mW output from the optical ferrule cannulae tip. Stability of light retention over the course of experimentation was confirmed (SI Appendix, Fig. S15 and Table S16). Statistics and Data Analysis. 
All data analysis, except the behavioral economics component of the study, was performed with SigmaPlot (version 11). ANOVA and Bonferroni post hoc tests were used to assess for changes in dopamine concentration.
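To illustrate the logic of the multilevel Bayesian analysis described above, the following Python is a minimal sketch that computes a posterior over a treatment-specific α on a grid, using a flat prior and the normal likelihood from the model. It is not the Hamiltonian Monte Carlo procedure run in Stan: Q_max, Q_min, and σ are held fixed at assumed values rather than inferred, and the treatment labels, prices, consumption values, and function names are all hypothetical.

```python
import math

def demand(price, q_max, q_min, alpha):
    """Model prediction Q_mod for one unit price."""
    return q_min + (q_max - q_min) * math.exp(-alpha * price)

def alpha_posterior(prices, observed, grid, q_max, q_min, sigma):
    """Grid posterior for alpha under a flat prior and N(Q_mod, sigma^2) likelihood."""
    log_post = []
    for alpha in grid:
        # Sum of normal log-densities; terms that do not depend on alpha are dropped.
        ll = sum(-0.5 * ((q - demand(c, q_max, q_min, alpha)) / sigma) ** 2
                 for c, q in zip(prices, observed))
        log_post.append(ll)
    peak = max(log_post)              # subtract the maximum for numerical stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def prob_greater(grid, post_a, post_b):
    """P(alpha_a > alpha_b), treating the two grid posteriors as independent."""
    p, cdf_b = 0.0, 0.0
    for i in range(len(grid)):
        p += post_a[i] * cdf_b        # cdf_b is P(alpha_b < grid[i]) for an increasing grid
        cdf_b += post_b[i]
    return p

# Assumed (not inferred) shared parameters and hypothetical data.
Q_MAX, Q_MIN, SIGMA = 900.0, 0.0, 10.0
prices = [0.1, 0.2, 0.5, 1.0, 2.0, 4.0, 8.0]
noise = [5.0, -3.0, 4.0, -6.0, 2.0, -1.0, 3.0]   # fixed illustrative residuals
observed = {
    "control": [demand(c, Q_MAX, Q_MIN, 0.5) + e for c, e in zip(prices, noise)],
    "stim_at_cue": [demand(c, Q_MAX, Q_MIN, 0.8) + e for c, e in zip(prices, noise)],
}

grid = [0.01 + 0.005 * k for k in range(399)]    # candidate alpha values, 0.01 to 2.0
posterior = {trt: alpha_posterior(prices, q, grid, Q_MAX, Q_MIN, SIGMA)
             for trt, q in observed.items()}
p_more_sensitive = prob_greater(grid, posterior["stim_at_cue"], posterior["control"])
print(p_more_sensitive)
```

Parameter values that bring the model prediction closer to the observations receive more posterior weight, and comparing the treatment-level posteriors gives a direct probability statement about whether one treatment is associated with a larger α than another.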

Acknowledgments We thank Drs. David Roberts, Joseph Cheer, and Lindsey Hamilton for helpful comments during the preparation of this manuscript. We also thank Scott Ng-Evans for technical support. Funding for the project was provided by National Science Foundation Grant IOS-1557755, NIH Grant R03DA038734, Boettcher Young Investigator Award and National Alliance for Research on Schizophrenia and Depression Young Investigator Award (to E.B.O.), and an institutional Undergraduate Research Opportunities Program (to S.A.S.).

Footnotes Author contributions: E.B.O. designed research; S.A.S., K.J.P., D.R.R., D.M.G., G.K., and E.B.O. performed research; S.A.S., D.M.G., R.D., and E.B.O. analyzed data; R.D. and E.B.O. wrote the paper; and R.D. developed computer programs used for data analysis.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. W.S. is a guest editor invited by the Editorial Board.

See Commentary on page 13597.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1706969114/-/DCSupplemental.