Induction of ethanol dependence disrupts decision-making

To investigate whether prior drug dependence results in a long-lasting disruption to decision-making processes, we utilized a well-validated CIE model to induce ethanol dependence in mice32,33,34,35,36,37. Mice were exposed to periods of CIE or air (Air) vapor and subsequent withdrawal over a period of four weeks (Fig. 1a, three vapor cohorts, Air n = 15, CIE n = 19). Mice were placed in inhalation chambers and exposed to ethanol or air vapor for 16 h per day, 4 days per week. We did not give a loading dose of ethanol or a pretreatment of pyrazole33 to avoid confounding effects of stress that can bias reliance on habitual control42, as well as to avoid pyrazole’s broad effects on neural activity including actions at the N-methyl-d-aspartate (NMDA) receptor43. Even without pretreatment, our procedure produced mean blood ethanol concentrations of 34.7 ± 2.0 mM, similar to what has previously been reported34,44. At 72 h after the last CIE exposure, mice were food restricted to achieve 90% of their baseline weight for 2 days prior to instrumental lever press training for food pellets or sucrose.

Fig. 1 Chronic ethanol exposure and repeated withdrawal biases towards habitual control over actions. a Experimental timeline of CIE and the following operant training and subsequent outcome devaluation testing (DV). b Mice (3 cohorts, Air n = 15, CIE n = 19) are trained to press the same lever (left or right) for the same food outcome (food pellet or sucrose) in two distinct contexts under RI or RR schedules of reinforcement. c, d Response rate of lever pressing during acquisition under RI (c) or RR (d) schedules. e Schematic of the outcome devaluation procedures. On the devalued day, mice receive 1 h free access to the outcome previously produced by lever pressing, immediately followed by a 5 min extinction test in each context. To control for effects of general satiety on responding, on the valued day mice receive 1 h free access to the remaining outcome, immediately followed by a 5 min extinction test in each context. f Normalized lever presses showing the distribution of lever pressing between the valued and devalued day in each training context. g Devaluation index (see Methods) for each group in the previously trained RI and RR contexts. Data are represented as mean ± SEM. ****p < 0.001, *p < 0.05 reflect one-sample t tests against 0.5 and 0, respectively.

Decision-making recruits parallel action strategies: goal-directed actions and habitual actions24. If ethanol dependence does induce long-lasting changes to decision-making processes, they may be apparent in the disrupted use of goal-directed actions or a bias towards reliance on habits. We utilized an instrumental task we recently developed, in which, on the same day, the same mouse shifts between goal-directed and habitual control over food responding19,21. In brief, mice were trained in two distinct contexts to press a lever in the same location for the same food outcome (food pellet or 20% sucrose). To predispose the use of habitual vs. goal-directed action control, mice were trained to lever press under random interval (RI) and random ratio (RR) schedules of reinforcement, respectively (Fig. 1b)45,46,47. Trained under these schedules, Air mice and CIE mice acquired lever press behavior for food (Fig. 1c, d; Supplementary Fig. 1). Although CIE exposure visually appeared to increase response rate, a three-way repeated measures ANOVA (context × CIE exposure × training day) on response rate during training showed a main effect of training day (F (8, 112) = 30.61, p < 0.001), but no main effect of CIE exposure or interaction (Fs < 1.31). This suggests that while CIE exposure may have led to slightly increased response rates, all mice increased lever pressing in a similar manner across training.

To assess whether an action is goal-directed or habitual, we examined the sensitivity of lever pressing to changes in expected outcome value using outcome devaluation procedures. Fifteen to 21 days after the last vapor exposure, and following lever press training, we subjected Air and CIE mice to sensory-specific satiation of the food outcome (food pellet or sucrose) previously produced by lever pressing (devalued state), or of a control outcome (the remaining outcome) that mice had previously experienced in their home-cage (valued state) (Fig. 1e). Each prefeeding period was followed by a brief 5-min test in each of the trained contexts, where we measured the number of non-reinforced lever presses made. A significant reduction in lever pressing in the devalued state compared to the valued state is indicative of goal-directed control, while similar pressing between states reflects habitual control48.

Air mice readily shifted between using a goal-directed strategy in the previously RR trained context and control by more habitual processes in the previously RI trained context, while CIE mice showed a notable lack of goal-directed control in both RI and RR training contexts (Fig. 1f, g; Supplementary Fig. 1e). Since differences in response rates during acquisition and testing were observed within a group, as well as between Air and CIE exposure groups (Supplementary Figs. 1 and 3), lever presses were normalized to total presses made in each context during testing. This allows us to examine CIE effects on decision-making in the absence of differences in response rates. A three-way ANOVA on normalized lever pressing showed a significant three-way interaction (devaluation state × context × CIE exposure: F (1, 32) = 10.83, p = 0.002), a significant two-way interaction of devaluation state × context (F (1, 32) = 10.57, p = 0.003) and a main effect of devaluation state (F (1, 32) = 5.20, p = 0.03), but no other two-way interactions or main effects (Fs < 2.00). This suggests that CIE mice and Air mice show different sensitivity to outcome devaluation testing across RI and RR training contexts.

To examine whether Air and CIE mice differentially distributed lever pressing across valued and devalued days, we performed one-sample t tests against 0.5 (equal lever pressing on valued and devalued days) on normalized lever press data. Air mice differentially distributed lever presses between valued and devalued days in the RR training context (t 14 = 5.95, p < 0.0001) but not in the RI context (t 14 = 0.86, p = 0.40) (Fig. 1f). In striking contrast to Air mice, CIE mice did not differentially distribute lever pressing between valuation states in either RI or RR trained contexts (ts < 0.60) (Fig. 1f). These findings confirm that Air mice reduced responding following outcome devaluation in the RR context but not the RI context, and that CIE exposure resulted in lever pressing insensitive to outcome devaluation in either training context.
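The normalization and the test against 0.5 described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names and the press counts are hypothetical, and only the t statistic and degrees of freedom are computed (a p value would additionally require the t distribution).

```python
import math
import statistics


def normalized_valued_fraction(valued_presses, devalued_presses):
    """Fraction of a mouse's test presses made on the valued day.

    A value of 0.5 means equal pressing on valued and devalued days.
    """
    total = valued_presses + devalued_presses
    if total == 0:
        raise ValueError("mouse made no lever presses in this context")
    return valued_presses / total


def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t test against mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical (valued, devalued) press counts for four mice in one context.
presses = [(30, 10), (24, 16), (28, 12), (18, 22)]
fractions = [normalized_valued_fraction(v, d) for v, d in presses]
t_stat, df = one_sample_t(fractions, mu0=0.5)
print(f"t({df}) = {t_stat:.2f}")
```

Because each mouse is normalized to its own total, a mouse that presses at a high overall rate and one that presses at a low rate contribute on the same 0 to 1 scale.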

We then used a devaluation index to assess whether an individual mouse shifted the degree to which lever pressing was goal-directed between contexts (Methods). We found that CIE exposure disrupts the within-subject shift in goal-directed control. We performed repeated measures ANOVA (context × CIE exposure) and found a significant interaction (F (1, 32) = 10.83, p = 0.002) and a main effect of context (F (1, 32) = 10.57, p = 0.003), but no effect of CIE exposure (F < 1.95). Although Air mice showed an increase in goal-directedness in the RR context compared to the RI context (Bonferroni-corrected p < 0.001), CIE mice showed similar levels of goal-directedness between contexts (p > 0.1) (Fig. 1g). One-sample t tests performed against a hypothetical devaluation index of 0 (equal pressing between valued and devalued states) confirmed significant goal-directed control in Air mice in the RR context (t 14 = 5.95, p < 0.001), but not in the RI context (t 14 = 0.86), nor in CIE mice in either the RI or RR context (ts < 0.60).
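The exact devaluation index is defined in the Methods, which are not reproduced here. As an assumption for illustration only, the sketch below uses one common form, (valued − devalued) / (valued + devalued), which equals 0 for equal pressing in the two states. Under that assumed form the index is a linear rescaling of the normalized valued fraction, which would be consistent with the matching t statistics reported for the tests against 0.5 and against 0. All function names and numbers are hypothetical.

```python
def normalized_valued_fraction(valued_presses, devalued_presses):
    """Fraction of test presses made on the valued day (0.5 = equal pressing)."""
    total = valued_presses + devalued_presses
    if total == 0:
        raise ValueError("mouse made no lever presses in this context")
    return valued_presses / total


def devaluation_index(valued_presses, devalued_presses):
    """ASSUMED index: (valued - devalued) / (valued + devalued); 0 = equal pressing."""
    total = valued_presses + devalued_presses
    if total == 0:
        raise ValueError("mouse made no lever presses in this context")
    return (valued_presses - devalued_presses) / total


# Hypothetical press counts for one mouse tested in each context.
rr_index = devaluation_index(valued_presses=30, devalued_presses=10)  # goal-directed pattern
ri_index = devaluation_index(valued_presses=20, devalued_presses=20)  # habitual pattern

# Under the assumed definition, index = 2 * fraction - 1.
fraction = normalized_valued_fraction(30, 10)
assert abs(rr_index - (2 * fraction - 1)) < 1e-12
```

A within-subject shift then corresponds to the difference between a mouse's RR and RI indices, here 0.5 for the hypothetical mouse.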

The lack of goal-directedness in CIE mice cannot be attributed to dependence-induced changes in outcome palatability or sensitivity to devaluation. A subset of Air mice and CIE mice underwent a post-test free feeding assay immediately following outcome devaluation testing. Air and CIE mice consumed similar amounts in prefeeding devaluation procedures as well as in post-test free feeding procedures (Supplementary Fig. 1d, g). Further, correlational analyses performed between the response rate during training and responding during testing suggest that the increased response rate observed in CIE mice did not contribute to the differences in the magnitude of subsequent goal-directed control (Supplementary Fig. 1h). Instead, the present findings suggest that prior CIE exposure results in a long-lasting deficit in decision-making processes, as reflected in the disruption of goal-directed control examined ~3 weeks after the last exposure to ethanol.

CIE exposure selectively alters orbitostriatal circuits

CIE exposure resulted in long-lasting changes to decision-making processes, suggesting that ethanol dependence induced changes in neural circuits controlling goal-directed and/or habitual actions. For example, CIE exposure may disrupt neural circuits supporting goal-directed control, or enhance control of neural circuits modulating habits. Previous research has shown long-term changes in the cortical activation of abstinent alcoholics30. In particular, hypoactivation of OFC circuits correlated with impaired reward choice behavior in abstinence. Abstinent alcoholics were found to have an immediate reward bias, with BOLD signal in lateral OFC correlated to delayed reward choice27. More recently, OFC activity has been shown to support goal-directed action control across species19,21,38,39,40,49, and is an important regulator of the shift between goal-directed and habitual control. We recently showed that increases in excitatory transmission at OFC terminals in dorsal striatum (OFC-DMS) drive goal-directed control, with habitual control emerging from the attenuation of OFC-DMS transmission21.

Given the importance of OFC-DMS function for goal-directed control over actions, we hypothesized that ethanol dependence alters OFC function through changes in synaptic transmission onto DMS. Mice were exposed to CIE procedures and ex vivo whole-cell electrophysiological recordings were conducted 3–21 days after the last vapor exposure, corresponding to the time frame of acquisition and devaluation testing (Fig. 2a). First, we examined whether dependence alters intrinsic properties of OFC projection neurons. We observed a decrease in excitability of OFC projection neurons following CIE procedures (repeated measures ANOVA: interaction (CIE exposure × current) = F (10, 170) = 5.23, p < 0.0001; main effect of current = F (10, 170) = 27.82, p < 0.0001; main effect of CIE exposure = F (1, 17) = 5.81, p = 0.03) (Fig. 2b, c, Supplementary Table 1; 3 vapor cohorts, Air n = 8, CIE n = 11) that was present across the 3–21-day range of testing (Supplementary Fig. 2b). In addition, we found that resting membrane potentials were hyperpolarized in CIE-exposed mice compared to Air controls (Supplementary Table 1). This suggests that ethanol dependence induces a long-lasting reduction in the excitability of OFC projection neurons that is present even after a significant period of abstinence.

Fig. 2 CIE induces long-lasting disruptions to orbitostriatal circuits. a Experimental timeline used for electrophysiological recordings. Mice were given viral injections and allowed 2–4 weeks to recover before exposure to the CIE procedure. b Schematic of OFC recording site. c The number of spikes plotted against current injected (left) and representative traces of action potential firing at 200 pA (right) (3 cohorts, Air n = 8, CIE n = 11). d Schematic of OFC injection site and DMS recording site for optically induced currents. e Cre-dependent ChR2-YFP expression at the OFC injection site (left) and OFC terminals in the DMS (right). f Paired pulse ratio (PPR) of optically induced currents of OFC input to D1 SPNs. Scale bars represent 25 ms (horizontal) and 50 pA (vertical) (3 cohorts, Air n = 7, CIE n = 15). g Representative current traces of asynchronous release to D1 SPNs recorded in 2 mM Sr2+. Scale bars represent 250 ms (horizontal) and 50 pA (vertical). h Average frequency of asynchronous release to D1 SPNs (Air n = 8, CIE n = 12). i Average amplitude of asynchronous release to D1 SPNs. j PPR of optically induced currents of OFC input to D2 SPNs (3 cohorts, Air n = 7, CIE n = 9). k Representative current traces of asynchronous release to D2 SPNs recorded in 2 mM Sr2+. l Average frequency of asynchronous release to D2 SPNs (Air n = 7, CIE n = 7). m Average amplitude of asynchronous release to D2 SPNs. Data points and bar graphs represent the average ± SEM. ****p < 0.0001, **p < 0.01, *p < 0.05.

We next examined whether ethanol dependence would result in changes to OFC-DMS transmission. OFC projection neurons synapse in similar proportions onto spiny projection neurons (SPNs) of both major basal ganglia output pathways in the DMS50: SPNs of the direct pathway that express the dopamine-type 1 receptor (D1 SPNs) and SPNs of the indirect pathway that express the dopamine-type 2 receptor (D2 SPNs)51. Direct and indirect basal ganglia pathways are thought to coordinate activity to support action selection and performance52. We hypothesized that OFC-DMS transmission onto direct and indirect pathways may be altered by ethanol dependence. To investigate OFC-DMS circuits in a projection and cell-type-specific manner, we utilized a viral approach in transgenic mice to selectively examine OFC transmission onto D1 or D2 SPNs. To target direct and indirect pathway SPNs, we utilized multiple transgenic lines to ensure the reproducibility of our findings. B6.FVB(Cg)-Tg(Drd1-cre)EY266Gsat/Mmucd (D1-Cre) and B6.Cg-Tg(Drd1a-tdTomato)6Calak/J (D1-tdTomato) transgenic mice were used to target the direct pathway (D1 SPNs), while B6.FVB(Cg)-Tg(Adora2a-cre)KG139Gsat/Mmucd (A2A-Cre) and non-labeled SPNs from D1-tdTomato transgenic lines were used to label the indirect pathway (D2 SPNs). No differences were observed between transgenic lines, so results were combined. All mice were injected with AAV5-CamKIIa-GFP-Cre and a Cre-dependent channelrhodopsin (AAV5-Ef1a-DIO-ChR2-eYFP) (UNC viral vector core) in the OFC to limit channelrhodopsin expression to CamKIIa-expressing neurons (Fig. 2d, e)19,21. D1-Cre and A2A-Cre mice were also injected with AAV5-hSyn-DIO-mCherry targeted to the DMS to label SPN populations. One to 3 weeks after surgery, mice underwent CIE procedures (Fig. 2a). Following acute withdrawal after the last vapor exposure (3–21 days), whole-cell patch-clamp recordings of identified D1 or D2 SPNs were made and transmission in response to light activation of OFC terminals was examined.

We first used paired pulse ratio (PPR) to examine whether CIE procedures altered the probability of neurotransmitter release from OFC terminals onto D1 SPNs. In Air mice, we found a high probability of neurotransmitter release at the OFC input onto D1 SPNs, as indicated by a paired pulse depression (PPD) (Fig. 2f; Supplementary Fig. 2c). In stark contrast, recordings made from CIE mice showed significant paired pulse facilitation (PPF), revealing a decrease in probability of neurotransmitter release from OFC terminals onto D1 SPNs. A direct comparison between Air and CIE mice using a two-way ANOVA (CIE exposure × interstimulus interval (ISI)) showed a significant interaction (F (4, 68) = 3.53, p = 0.01), and main effect of CIE exposure (F (1, 17) = 16.96, p = 0.0007) (Bonferroni-corrected, ****p < 0.0001 vs. Air; **p < 0.01 vs. Air; *p < 0.05 vs. Air) (Fig. 2f, three vapor cohorts, Air n = 7, CIE n = 15). To further investigate the observed decrease in neurotransmitter release at OFC-DMS terminals, we replaced Ca2+ with strontium (Sr2+) in the recording solution and optically stimulated the OFC input. Sr2+ in the recording solution has previously been used to examine asynchronous release in an input-specific manner, with a decrease in release reflecting a decrease in release probability53,54,55. In CIE mice, we observed a significant decrease in frequency of asynchronous release onto D1 SPNs compared to that observed in Air mice (Student’s t test, t 18 = 3.13, p < 0.01) with no change in amplitude (Student’s t test, t 18 = 0.20, p = 0.84) (Fig. 2g–i, Air, n = 8, CIE, n = 12). This selective decrease in asynchronous release onto the direct pathway was also apparent across the full range of testing (Supplementary Fig. 2d), suggesting a long-lasting decrease in OFC transmission selectively onto the direct pathway.
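For readers less familiar with the measure, PPR is conventionally the amplitude of the second evoked response divided by the amplitude of the first: a ratio below 1 indicates paired pulse depression (consistent with high initial release probability) and a ratio above 1 indicates facilitation (consistent with lower release probability). The sketch below illustrates that convention only; the function names and current amplitudes are hypothetical and are not taken from the recordings reported here.

```python
def paired_pulse_ratio(first_amplitude_pa, second_amplitude_pa):
    """Ratio of second to first evoked current amplitude.

    Absolute values are used because inward EPSCs are recorded as negative currents.
    """
    if first_amplitude_pa == 0:
        raise ValueError("first response has zero amplitude")
    return abs(second_amplitude_pa) / abs(first_amplitude_pa)


def classify_ppr(ppr):
    """Label a ratio as paired pulse depression, facilitation, or no change."""
    if ppr < 1:
        return "depression"
    if ppr > 1:
        return "facilitation"
    return "no change"


# Hypothetical peak amplitudes (pA) for two cells at a single interstimulus interval.
depressing = paired_pulse_ratio(first_amplitude_pa=-200.0, second_amplitude_pa=-150.0)
facilitating = paired_pulse_ratio(first_amplitude_pa=-100.0, second_amplitude_pa=-130.0)

print(depressing, classify_ppr(depressing))
print(facilitating, classify_ppr(facilitating))
```

In practice the ratio would be computed per cell at each interstimulus interval before being entered into the group comparison.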

We next examined whether the induction of ethanol dependence would alter OFC input onto the indirect pathway. Similar to OFC transmission onto D1 SPNs in Air mice, we observed PPD of OFC transmission onto D2 SPNs in Air control mice. However, the PPD of OFC transmission onto D2 SPNs was still present in CIE mice (two-way ANOVA of CIE exposure × ISI: interaction and main effects Fs < 1.0) (Fig. 2j; three vapor cohorts, Air n = 7, CIE n = 9). When we examined asynchronous release at OFC terminals onto D2 SPNs, we found no differences between Air and CIE mice in the presence of Sr2+ (frequency: Student’s t test, p = 0.96; amplitude: Student’s t test, p = 0.26) (Fig. 2k–m, Air, n = 7, CIE, n = 7). Together, these results show that prior ethanol dependence induced long-lasting decreases in OFC neurotransmitter release into DMS in a cell-type-specific manner, selectively affecting transmission onto the direct but not indirect output pathway of the basal ganglia.

In addition, the decrease in neurotransmitter release was at least partially selective to the OFC input. When we used electrical stimulation to examine all excitatory input onto either D1 or D2 SPNs (Fig. 3a, b), we found no differences in PPR (no interactions (ps > 0.05), D1 SPNs main effect of ISI: F (4, 40) = 37.69, p < 0.0001; D2 SPNs main effect of ISI: F (4, 40) = 16.84, p < 0.0001) (Fig. 3c, d). Furthermore, we found no differences between Air and CIE mice in spontaneous EPSC (sEPSC) frequency or amplitude in either D1 SPNs (Fig. 3e–g) or D2 SPNs (Fig. 3h–j) (ps > 0.05), again suggesting a disruption at least partially selective to OFC-DMS input. Together, our findings suggest that chronic ethanol dependence induces long-lasting decreases in the excitability and output of a pathway known to control goal-directed actions19,21, onto a pathway known to support action selection and performance16,52. Intriguingly, these changes are mediated through selective changes in OFC transmission onto the direct pathway.

Fig. 3 CIE does not alter all excitatory input into striatal circuits. a Experimental timeline for cohorts of mice used for electrophysiological recordings. Mice were injected with AAV ChR2 in the OFC and allowed 2–4 weeks to recover before exposure to the CIE procedure. Electrically evoked currents were recorded in the same brain slices used for optically evoked currents. b Schematic of DMS recording site and placement of stimulating electrode within striatum. c, d Paired pulse ratio (PPR) of electrically induced currents onto D1 SPNs (Air n = 6, CIE n = 6) (c) and D2 SPNs (Air n = 7, CIE n = 11) (d) in Air or CIE-exposed mice. Scale bars represent 25 ms (horizontal) and 50 pA (vertical). e Representative traces of spontaneous EPSCs (sEPSCs) in D1 SPNs in Air and CIE mice. Scale bars represent 1 s (horizontal) and 20 pA (vertical). f, g Frequency (f) and amplitude (g) of sEPSCs in D1 SPNs from Air and CIE mice. h Representative traces of sEPSCs in D2 SPNs in Air and CIE mice. i, j Frequency (i) and amplitude (j) of sEPSCs in D2 SPNs from Air and CIE mice. Data points and bar graphs represent the average ± SEM.

OFC activation restores goal-directed control following CIE

While our ex vivo results suggest that the deficits in goal-directed behavior may be in part due to reduced OFC excitability and synaptic transmission into DMS, we do not know whether the observed changes directly biased decision-making. To examine this, we took a chemogenetic approach56 to selectively increase OFC projection neuron activity in CIE mice during outcome devaluation testing (Fig. 4a, b). We injected an activating DREADD (AAV5-hSyn-DIO-hM3Dq-mCherry) into the OFC of B6.129S2-Emx1tm1(cre)Krj/J (Emx1-Cre) mice, thereby restricting expression to OFC projection neurons. A subset of Air and CIE mice were injected with AAV5-hSyn-DIO-mCherry (DIO-mCherry) to control for any effects of surgery, AAV infection, and CNO administration. Post-recovery, mice were subjected to CIE exposure procedures, followed by instrumental training and outcome devaluation testing (Fig. 4a, 3 vapor cohorts, groups: Air n = 16, CIE control n = 14, CIE H3 n = 19). To confirm the function of our manipulation, we conducted whole-cell current clamp recordings in identified OFC projection neurons expressing mCherry from infusions of AAV5-hSyn-DIO-hM3Dq-mCherry (Fig. 4b). Bath application of CNO (10 µM) resulted in a significant increase in excitability (two-way repeated measures ANOVA (current × CNO), interaction: F (14, 70) = 7.52, p < 0.0001; main effect of CNO: F (1, 5) = 13.95, p = 0.01) (Fig. 4c, n = 6).

Fig. 4 Activation of OFC neurons restores control by goal-directed processes. a Experimental outline. Mice were injected with an activating DREADD (hM3Dq) in the OFC. After recovery, mice underwent CIE procedures followed by operant training and outcome devaluation testing. CNO was administered to hM3Dq expressing CIE-exposed mice (CIE H3), control CIE-exposed mice (CIE), and Air-exposed mice (Air) 30 min prior to prefeeding on both valued and devalued testing days. b Schematic of OFC injection site with hM3D-mCherry (left) and subsequent expression of mCherry in the OFC (middle). Schematic of maximum (pink) and minimum (red) boundaries of viral spread in the OFC (right). c The number of spikes plotted against the current injected (left). Representative traces of action potential firing at 200 pA (right) (n = 6 cells). d Normalized lever presses for each group (three cohorts, Air n = 16, CIE control n = 14, CIE H3 n = 19) showing the distribution of lever presses between valued and devalued days in random interval (RI) and random ratio (RR) trained contexts. e Devaluation index for each group in the previously trained RI and RR contexts. Data are represented as mean ± SEM. **p < 0.01 and ***p < 0.001 reflect one-sample t tests against 0.5 and *p < 0.05 and #p = 0.09 reflect one-sample t tests against 0.

Following CIE procedures, all mice underwent instrumental lever press training for food (Supplementary Fig. 3). Prior to outcome devaluation testing, all mice were given pretreatments of saline or CNO (1 mg/kg, 10 ml/kg). We used virus and drug treatment controls in each Air and CIE control group and did not see differences between controls injected with saline or CNO; therefore, we collapsed across controls for ease of presentation. While Air mice showed more goal-directedness in the RR vs. RI context (albeit to a lesser degree), and CIE mice showed little sensitivity to outcome devaluation and were habitual in both contexts, CIE H3 mice showed goal-directed control in both RI and RR contexts. This was supported by a three-way repeated measures ANOVA performed on normalized lever presses (devaluation state × context × group) that did not show a significant three-way interaction (F (2, 46) = 1.29, p = 0.28), but did show a significant two-way interaction between context and group (F (2, 46) > 15.0, p < 0.001), suggesting that Air, CIE, and CIE H3 groups showed different patterns of lever pressing in RI and RR training contexts (Fig. 4d, Supplementary Fig. 3f). Main effects of context (F (1, 46) > 15, p < 0.001) and devaluation state (F (1, 46) = 14.24, p < 0.001) were also observed, showing that on average, lever pressing differed between RI and RR training contexts as well as between valued and devalued states.

The finding of different patterns of lever pressing between groups was further supported by one-sample t tests against 0.5 conducted on normalized lever pressing. While Air mice differentially distributed lever pressing only in the RR context (albeit slightly) (t 15 = 2.24, p < 0.05) and not in the RI context (t 15 = 1.12, p = 0.28), CIE mice did not differentially distribute lever pressing between valuation states in either context (one-sample t test, RI: t 13 = 0.96; RR: t 13 = 0.84). CNO administration to CIE mice expressing the activating DREADD in OFC projection neurons (CIE H3) restored goal-directed control in the RR context (one-sample t test, t 18 = 3.90, p < 0.01) and resulted in goal-directed control in the RI context (one-sample t test, t 18 = 4.85, p < 0.001) (Fig. 4d).