Only a small number of pharmacological options are approved for the treatment of alcohol use disorder, all of which are limited by inconsistent and/or poor efficacy. This lack of effective therapies may explain why kratom has also reportedly been used to self‐medicate symptoms associated with alcohol withdrawal (Havemann‐Reinecke, 2011; McWhirter & Morris, 2010; Singh, Muller, & Vicknasingam, 2014; Suhaimi et al., 2016). Previous research has shown that G protein‐biased δOP agonists decrease voluntary alcohol intake in C57BL/6 male mice, while δOP agonists that strongly recruit β‐arrestin 2 increase voluntary alcohol intake (Chiang, Sansuk, & van Rijn, 2016; Nielsen et al., 2012; Robins, Chiang, Mores, Alongkronrusmee, & van Rijn, 2018). From these findings, we hypothesized that the reported utility of kratom in reducing alcohol intake stems from kratom's constituent alkaloids displaying G protein bias at δOP. To test our hypothesis, we characterized the μOP, δOP and κOP pharmacology of two separate kratom extracts, four isolated major kratom alkaloids (mitragynine, speciogynine, paynantheine and 7‐hydroxymitragynine, Figure 1c), and three synthetic opioids, N‐cycloheptyl‐1‐phenethyl‐4‐(N‐phenylpropionamido)piperidine‐4‐carboxamide, N‐cyclopropyl‐1‐phenethyl‐4‐(N‐phenylpropionamido)piperidine‐4‐carboxamide and N‐(tert‐butyl)‐1‐phenethyl‐4‐(N‐phenylpropionamido)piperidine‐4‐carboxamide (MP102, MP103 and MP105, respectively), that have G protein‐biased pharmacology similar to the kratom alkaloids. These extracts and drugs were also assessed for their ability to modulate alcohol intake, their effects on general locomotor behaviour, and their rewarding properties.

Figure 1. Interest in kratom has steadily increased over the last 5‐year period. Five‐year Google Trends analysis from June 2014 to June 2019 (performed June 8, 2019) comparing morphine (blue), kratom (red), heroin (yellow), and fentanyl (green). Note that Google searches for kratom outnumbered those for heroin starting November 2017. The spike in heroin searches coincides with the overdose of Demi Lovato. The first spike in fentanyl searches coincides with the death of Prince and the second spike with FDA approval of Dsuvia™ (sufentanil). The initial increase in kratom searches in the fall of 2016 coincided with the DEA's decision to defer scheduling kratom (a). Animated depiction of a kratom leaf (b). Chemical structures of characterized kratom alkaloids (c).

Interest in and use of the psychoactive plant Mitragyna speciosa (kratom) has risen dramatically across North America and Europe over the last 5 years (Singh, Narayanan, & Vicknasingam, 2016; Figure 1a,b). Historically, kratom has been used in its indigenous Southeast Asian regions to relieve pain, diarrhoea and cough or to provide stimulation (Prozialeck, Jivan, & Andurkar, 2012). The currently inflated interest in kratom in the United States coincides with changes in opioid prescribing guidelines by the Centers for Disease Control and Prevention in 2016 (Renthal, 2016) and the rise in heroin adulterated with fentanyl‐like opioids, leading to a spike in fatal and non‐fatal overdoses (Dowell, Noonan, & Houry, 2017; Gostin, Hodge, & Noe, 2017). Kratom contains more than 40 alkaloids with varying affinity and activity at opioid receptors (Adkins, Boyer, & McCurdy, 2011; Brown, Lund, & Murch, 2017; Hassan et al., 2013; Kruegel et al., 2016; Takayama, 2004) and is commonly used for the self‐medication of opioid dependence and withdrawal, the management of chronic pain and mood disorders, or as a substitute for heroin or prescription opioids (Grundmann, 2017; Singh et al., 2016; Smith & Lawson, 2017). Despite these perceived benefits, increasing rates of kratom use have led to concomitant increases in reports of adverse effects following consumption, although to date, no fatal overdoses have been attributed to kratom use alone (Cinosi et al., 2015; Kruegel & Grundmann, 2017). While the Drug Enforcement Administration recently decided to withhold its decision on classifying kratom as a Schedule I drug (Griffin & Webb, 2018; Grundmann, Brown, Henningfield, Swogger, & Walsh, 2018), reservations about the safety of kratom remain, leading to increased scrutiny of its current legal status in the United States (Henningfield, Fant, & Wang, 2018; Prozialeck, 2016).

2 METHODS

2.1 Materials

Speciogynine, paynantheine, 7‐hydroxymitragynine, and mitragynine were isolated (purity >95%) by column chromatography and provided by Dr Majumdar. MP102, MP103, and MP105 were synthetically derived (purity >97%) and provided by Dr Majumdar. Morphine sulfate pentahydrate, leu‐enkephalin, forskolin, hydrochloric acid, sodium sulfate, dichloromethane, ammonia, hexanes, and ethyl alcohol (200 proof) were purchased from Sigma‐Aldrich (St. Louis, MO, USA). Naltrindole hydrochloride, (2S)‐2‐[[2‐[[(2R)‐2‐[[(2S)‐2‐Amino‐3‐(4‐hydroxyphenyl)propanoyl]amino]propanoyl]amino]acetyl]‐methylamino]‐N‐(2‐hydroxyethyl)‐3‐phenylpropanamide (DAMGO), and 2‐(3,4‐dichlorophenyl)‐N‐methyl‐N‐[(1R,2R)‐2‐pyrrolidin‐1‐ylcyclohexyl]acetamide (U50,488) were purchased from Tocris Bioscience (Bio‐techne Corporation, Minneapolis, MN, USA). [(3‐Methoxythiophen‐2‐yl)methyl]((2‐[(9R)‐9‐(pyridin‐2‐yl)‐6‐oxaspiro‐[4.5]decan‐9‐yl]ethyl))amine (TRV130, oliceridine) was purchased from AdooQ Bioscience (Irvine, CA, USA). For animal drinking assays, pure ethyl alcohol was diluted to 10% or 20% alcohol in reverse osmosis water.

2.2 Kratom extract #1

An alkaloid extract was obtained from Maeng Da Micro Powder (MoonKratom, Austin, TX, USA) as described previously by Orio, Alexandru, Cravotto, Mantegna and Barge (2012). In brief, as shown in Figure S1A, extraction was performed by treating kratom powder in 95% ethanol at 50°C for 4 hr followed by removal of the remaining organic material by vacuum filtration. Solvent was then removed under reduced pressure and the crude extract resuspended in dilute aqueous hydrochloric acid (pH = 3) and washed with hexane. The aqueous solution was then basified (pH = 9) with 0.1‐M aqueous ammonia and the freebase alkaloid fraction extracted with dichloromethane. The alkaloid‐containing fraction was dried over anhydrous sodium sulfate and filtered, followed by removal of solvent under reduced pressure and further drying under high vacuum to obtain our kratom alkaloid extract as a crystalline, light brown solid. Extract composition was assessed on a 6550 quadrupole time‐of‐flight instrument (Agilent, Santa Clara, CA, USA; scan 105–100 amu) using a Zorbax Extend‐C18 column (Agilent) held at 30°C and a flow rate of 0.3 ml·min−1.

2.3 Kratom extract #2

Mitragynine was extracted from the powdered leaves by following our previously reported methods (Varadi et al., 2016; Figure S1B). “Red Indonesian Micro Powder” was purchased from MoonKratom. The kratom powder (500 g) was heated to 75°C to reflux in methanol (700 ml) for 40 min. The suspension was filtered, and the methanolic extraction process was repeated (3 × 500 ml). The solvent of the combined methanolic extract was removed under reduced pressure, and the content was dried using high vacuum. The dry residue was resuspended in 20% acetic acid solution (1 L; pH = 4) and washed with petroleum ether (4 × 500 ml). The aqueous layer was then cooled on an ice bath and basified (pH ~9) slowly with aqueous sodium hydroxide solution (3.5 M; ~1 L). Alkaloids were extracted in dichloromethane (4 × 400 ml) from the aqueous layer. The combined dichloromethane fractions were washed with brine (300 ml), dried over anhydrous sodium sulfate and filtered. The solvent was removed under reduced pressure, and the residue was dried under high vacuum to obtain kratom extract. The kratom extract was subjected to column chromatography (gradient: 0–40% ethyl acetate in hexanes) to isolate mitragynine (yield: 4.9 g), and smaller quantities of paynantheine (0.58 g) and speciogynine (0.35 g). For cellular assays, kratom extracts were dissolved to a concentration of 10 mM in 100% DMSO. The concentration was estimated by assigning the kratom extract a molecular mass of 400 g·mol−1, which is the average size of kratom alkaloids.
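The nominal stock concentration can be checked with a short calculation. The sketch below is illustrative only: the 10 mM target and the 400 g·mol−1 estimate come from the text, whereas the function name and the 1 ml example volume are assumptions made for the example.

```python
def extract_mass_mg(target_mM: float, volume_ml: float,
                    assumed_mw_g_per_mol: float = 400.0) -> float:
    """Mass of extract (mg) needed for a nominal molar concentration.

    assumed_mw_g_per_mol is the estimated average mass of kratom
    alkaloids (400 g/mol in the text); the result is only as good
    as that estimate.
    """
    moles = (target_mM / 1000.0) * (volume_ml / 1000.0)  # mol = (mol/L) * L
    return moles * assumed_mw_g_per_mol * 1000.0         # g -> mg


# Hypothetical example: 1 ml of a nominal 10 mM stock in DMSO
mass = extract_mass_mg(target_mM=10.0, volume_ml=1.0)
print(f"{mass:.1f} mg extract per ml DMSO")
```

Under these assumptions, a nominal 10 mM stock corresponds to about 4 mg of extract per ml of DMSO.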

2.4 Cell culture and biased signalling assays

cAMP inhibition and β‐arrestin 2 recruitment assays were performed as previously described (Chiang et al., 2016). In brief, for cAMP inhibition assays, HEK 293 (RRID:CVCL_0045, Life Technologies, Grand Island, NY, USA) cells were transiently transfected in a 1:3 ratio with FLAG‐mouse δOP, HA‐mouse μOP, or FLAG‐mouse κOP and pGloSensor22F‐cAMP plasmids (Promega, Madison, WI, USA) using Xtremegene9 (Sigma). Two days post‐transfection, cells (20,000 cells per well, 7.5 μl) were seeded in low‐volume Greiner 384‐well plates (#82051‐458, VWR, Batavia, IL, USA) and incubated with GloSensor reagent (Promega, 7.5 μl, 2% final concentration) for 90 min at room temperature. Cells were stimulated with 5‐μl drug solution for 20 min at room temperature prior to stimulation with 5‐μl forskolin (final concentration 30 μM) for an additional 15 min at room temperature. For β‐arrestin recruitment assays, CHO‐K1‐human μOP PathHunter β‐arrestin 2 cells (RRID:CVCL_KY70) and CHO‐K1‐human δOP PathHunter β‐arrestin 2 cells (RRID:CVCL_KY68) or U‐2 osteosarcoma (U2OS)‐human κOP PathHunter β‐arrestin 2 cells (RRID:CVCL_LA97, DiscoverX, Fremont, CA, USA) were plated (2,500 cells per well, 10 μl) 1 day prior to stimulation with 2.5‐μl drug solution for 90 min at 37°C/5% CO2, after which cells were incubated with 6‐μl cell PathHunter assay buffer (DiscoverX) for 60 min at room temperature as per the manufacturer's protocol. Luminescence for each of these assays was measured using a FlexStation3 plate reader (Molecular Devices, Sunnyvale, CA, USA).

2.5 Calculation of bias factor

In order to determine ligand bias, we followed the operational model equation in GraphPad Prism 8 (RRID:SCR_002798, GraphPad Software, La Jolla, CA) to calculate LogR (τ/KA; Table S1), ΔLogR, and ΔΔLogR as previously described (van der Westhuizen, Breton, Christopoulos, & Bouvier, 2014). Subsequently, bias factors (10^ΔΔLogR) were calculated using DAMGO, leu‐enkephalin, and U50,488 as reference compounds for μOP, δOP, and κOP, respectively. All three reference compounds were more potent in the cAMP (G protein) assay than in the β‐arrestin 2 recruitment assay and thus were not unbiased but G protein‐biased to begin with. A bias factor >1 meant that the agonist was more G protein‐biased than the reference compound; a bias factor <1 meant that the agonist was less G protein‐biased than the reference compound. Bias factors for compounds with <30% efficacy for β‐arrestin 2 recruitment could not reliably be calculated and are listed as undeterminable (UD; Table 1), which indicates that these agonists can be considered to be efficacy dominant for G protein signalling (Kenakin, 2015).

Table 1. Pharmacological characterization of kratom alkaloids and synthetic G protein‐biased opioids at the μ, δ and κ opioid receptor

Compound          cAMP pIC50        cAMP α    β‐arrestin 2 pEC50  β‐arrestin 2 α  Bias factor

μOR
DAMGO             8.4 ± 0.1 (23)    100       6.7 ± 0.1 (22)      100             1
Kratom #1         6.0 ± 0.2 (7)     71 ± 7    ND (3)              ND              UD
Kratom #2         5.7 ± 0.3 (3)     100 ± 7   ND (3)              ND              UD
Morphine          8.5 ± 0.2 (3)     95 ± 4    6.8 ± 0.1 (3)       26 ± 2          UD
Mitragynine       6.3 ± 0.2 (8)     75 ± 6    ND (3)              ND              UD
7‐OH‐mitragynine  7.8 ± 0.1 (5)     84 ± 3    ND (3)              ND              UD
Speciogynine      5.5 ± 0.1 (5)     87 ± 6    ND (3)              ND              UD
Paynantheine      5.4 ± 0.1 (5)     100 ± 0   ND (3)              ND              UD
TRV130            7.9 ± 0.2 (7)     86 ± 3    ND (4)              ND              UD
MP102             5.4 ± 0.2 (6)     88 ± 4    5.2 ± 0.1 (4)       16 ± 5          UD
MP103             6.5 ± 0.2 (5)     90 ± 4    6.3 ± 0.2 (7)       63 ± 7          0.03
MP105             6.7 ± 0.4 (5)     87 ± 6    6.6 ± 0.2 (6)       54 ± 5          0.02

δOR
Leu‐enkephalin    8.5 ± 0.1 (34)    100       8.0 ± 0.1 (29)      100             1
Kratom #1         4.4 ± 0.3 (6)     30 ± 20   ND (4)              ND              UD
Kratom #2         5.8 ± 0.5 (4)     72 ± 13   ND (3)              ND              UD
Morphine          6.1 ± 0.2 (5)     82 ± 8    ND (3)              ND              UD
Mitragynine       4.8 ± 0.2 (5)     88 ± 8    ND (3)              ND              UD
7‐OH‐mitragynine  5.7 ± 0.2 (8)     80 ± 8    6.4 ± 0.3 (3)       14 ± 1          UD
Speciogynine      5.0 ± 0.3 (5)     94 ± 4    ND (3)              ND              UD
Paynantheine      5.6 ± 0.2 (4)     64 ± 13   ND (3)              ND              UD
TRV130            5.6 ± 0.4 (5)     48 ± 12   ND (4)              ND              UD
MP102             6.4 ± 0.1 (4)     77 ± 7    6.7 ± 0.4 (5)       20 ± 6          UD
MP103             5.5 ± 0.2 (4)     100 ± 1   5.8 ± 0.1 (3)       35 ± 2          3.8
MP105             5.4 ± 0.3 (4)     94 ± 5    6.2 ± 0.1 (3)       35 ± 4          1.4

κOR
U50,488           8.9 ± 0.1 (18)    100       7.3 ± 0.2 (10)      100             1
Kratom #1         6.8 ± 0.4 (8)     41 ± 8    ND (6)              ND              UD
Kratom #2         7.0 ± 0.2 (4)     93 ± 2    6.0 ± 0.2 (5)       12 ± 2          UD
Morphine          7.3 ± 0.2 (5)     89 ± 2    5.8 ± 0.2 (3)       16 ± 2          UD
Mitragynine       5.4 ± 0.4 (5)     67 ± 12   ND (4)              ND              UD
7‐OH‐mitragynine  6.2 ± 0.3 (9)     77 ± 5    ND (4)              ND              UD
Speciogynine      4.7 ± 0.3 (5)     70 ± 20   ND (4)              ND              UD
Paynantheine      5.3 ± 0.2 (4)     95 ± 5    ND (6)              ND              UD
TRV130            4.9 ± 0.1 (3)     74 ± 6    ND (3)              ND              UD
MP102             5.4 ± 0.1 (5)     85 ± 8    ND (3)              ND              UD
MP103             6.0 ± 0.1 (3)     93 ± 3    ND (3)              ND              UD
MP105             5.2 ± 0.2 (5)     96 ± 3    ND (3)              ND              UD
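The arithmetic that turns LogR values into a bias factor can be sketched as follows. This is a minimal illustration, not the authors' Prism workflow: it assumes LogR has already been fitted for each ligand and pathway, and it assumes the sign convention ΔΔLogR = ΔLogR(cAMP) − ΔLogR(β‐arrestin 2), which matches the statement that a bias factor >1 indicates greater G protein bias than the reference. The function name and the numerical values are hypothetical.

```python
def bias_factor(log_r_camp_ligand: float, log_r_camp_ref: float,
                log_r_arr_ligand: float, log_r_arr_ref: float) -> float:
    """Bias factor relative to a reference agonist.

    Each argument is a fitted LogR = log10(tau/KA) value for one
    ligand in one pathway (cAMP inhibition or beta-arrestin 2).
    """
    # Normalise each pathway to the reference compound
    delta_camp = log_r_camp_ligand - log_r_camp_ref
    delta_arr = log_r_arr_ligand - log_r_arr_ref
    # Assumed convention: positive values favour the G protein pathway
    dd_log_r = delta_camp - delta_arr
    return 10 ** dd_log_r


# Hypothetical LogR values, for illustration only
bf = bias_factor(log_r_camp_ligand=7.0, log_r_camp_ref=8.0,
                 log_r_arr_ligand=5.5, log_r_arr_ref=7.0)
print(f"bias factor = {bf:.2f}")  # ddLogR = 0.5, so the factor exceeds 1
```

By construction the reference compound has ΔΔLogR = 0 and therefore a bias factor of 1, as listed for DAMGO, leu‐enkephalin and U50,488 in Table 1.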

2.6 Animals

The animal protocol (#1305000864) describing the care and use of experimental animals was approved by the Purdue University Institutional Animal Care and Use Committee (https://www.purdue.edu/research/regulatory-affairs/animal-research/staff.php). Animal studies are reported in compliance with the ARRIVE guidelines (Kilkenny et al., 2010) and with the recommendations made by the British Journal of Pharmacology as well as recommendations of the National Institutes of Health Guide for the Care and Use of Laboratory Animals. For our experiments, we used adult male (19–24 g) and female (17–21 g) wild‐type C57BL/6NHsd mice (8–10 weeks old) purchased from Envigo (#044, Indianapolis, IN, USA), a line that originated from the National Institutes of Health. This is a strain known to readily consume alcohol (Belknap, Crabbe, & Young, 1993). We also used male δOP knockout C57BL/6 mice of similar age in a subset of experiments. δOP knockout mice were produced by removal of exon 2 as previously described (van Rijn & Whistler, 2009) and outbred to a C57BL/6 background (>10 generations). Roughly every 3 years, the strain is backcrossed to commercially obtained C57BL/6 mice (Envigo) to mitigate the effects of genetic drift. We provided food and water ad libitum unless specified otherwise for the binge ethanol experiments. With the exception of the ethanol experiments, where mice were individually housed in double grommet cages, animals were group housed in plexiglass cages in ventilated racks at an ambient temperature of 21°C in a room maintained on a reversed 12L:12D cycle (lights off at 10.00, lights on at 22.00) in Purdue University's animal facility, which is accredited by the Association for Assessment and Accreditation of Laboratory Animal Care. Mice were used only in a single behavioural paradigm with the exception of the two‐bottle choice model of moderate 10% alcohol consumption (Section 2.7), in which mice received increasing doses of the test drug over multiple weeks.

2.7 Two‐bottle choice model of moderate 10% alcohol consumption

Mice were trained to voluntarily consume alcohol in a limited access (4 hr·day−1), 2‐bottle choice (water vs. 10% ethanol), drinking‐in‐the‐dark protocol during their active phase (3 hr after the start of the dark cycle) until the alcohol intake was stable as previously described (van Rijn & Whistler, 2009). During the first 3 weeks of limited alcohol access, the mice increased their alcohol intake prior to reaching steady state consumption. After the completion of the third week of voluntary alcohol intake, injections were administered every Friday 30 min prior to the 4‐hr drinking session. Drug effect on alcohol and water intake was measured as a change in Friday total drinking minus average alcohol intake from Tuesday to Thursday (g·kg−1). Raw data for alcohol intake, water intake, alcohol preference and associated statistical analysis are provided in Figures S2–S4 and Table S2.
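The drug‐effect measure described above amounts to a simple difference score per mouse. The following sketch shows the calculation under the assumption that daily intake has already been converted to g·kg−1; the function name and the example values are hypothetical.

```python
from statistics import mean


def drug_effect(baseline_intakes_g_per_kg: list[float],
                friday_intake_g_per_kg: float) -> float:
    """Change in intake on the injection day relative to baseline.

    baseline_intakes_g_per_kg: Tuesday-Thursday intakes (g/kg)
    friday_intake_g_per_kg:    intake after injection on Friday (g/kg)
    Negative values indicate reduced intake after treatment.
    """
    return friday_intake_g_per_kg - mean(baseline_intakes_g_per_kg)


# Hypothetical mouse: Tue-Thu baseline, then a lower Friday intake
change = drug_effect([3.0, 3.5, 2.5], 2.0)
print(f"change in alcohol intake: {change:+.2f} g/kg")
```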

2.8 Intermittent, limited access of 20% alcohol binge consumption model

To measure the effects of drug administration on binge‐like alcohol consumption, we followed a drinking‐in‐the‐dark binge protocol (Rhodes, Best, Belknap, Finn, & Crabbe, 2005; Robins et al., 2018). In this 1‐week protocol, the water bottle for each cage was replaced with a bottle containing 20% ethanol for 2 hr on Monday–Thursday (i.e. no 2‐bottle choice). Using consumption data from these 4 days, the average ethanol consumption of the mice was ranked, and the mice were sorted into groups of comparable drinking (with each group to be administered either vehicle or a drug). On the Friday of the binge protocol, mice received i.p. injections of either vehicle or drug 30 min prior to a 4‐hr “binge” drinking period with access to 20% ethanol. To determine drug effect on alcohol intake, the amount of ethanol consumed during the binge period was compared between vehicle and drug‐treated groups. For all ethanol consumption experiments, bottle weights were measured directly before and after the ethanol access periods to the second decimal place to determine fluid intake, and weights of bottles were corrected for any spillage. In addition, the location of alcohol bottles in each paradigm was alternated daily (right vs. left grommet) to prevent habit formation.
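One possible way to form groups of comparable drinking from ranked baseline data is a serpentine ("snake") allocation. The text does not state which allocation scheme was used, so the sketch below is an assumption for illustration; the function name and mouse identifiers are hypothetical.

```python
from statistics import mean


def assign_matched_groups(intakes: dict[str, list[float]],
                          n_groups: int) -> list[list[str]]:
    """Rank mice by mean Monday-Thursday intake and deal them into
    groups in serpentine order (1..n, n..1, ...) so that group means
    end up similar. This is one possible scheme, not necessarily the
    one used in the study.
    """
    ranked = sorted(intakes, key=lambda m: mean(intakes[m]), reverse=True)
    groups: list[list[str]] = [[] for _ in range(n_groups)]
    for i, mouse in enumerate(ranked):
        block, pos = divmod(i, n_groups)
        # Reverse direction on every other pass through the groups
        idx = pos if block % 2 == 0 else n_groups - 1 - pos
        groups[idx].append(mouse)
    return groups


# Hypothetical Monday-Thursday intakes (g/kg) for four mice
baseline = {
    "m1": [4.0, 4.0, 4.0, 4.0],
    "m2": [3.0, 3.0, 3.0, 3.0],
    "m3": [2.0, 2.0, 2.0, 2.0],
    "m4": [1.0, 1.0, 1.0, 1.0],
}
print(assign_matched_groups(baseline, n_groups=2))
```

In this toy example the heaviest and lightest drinkers share a group, so both groups have the same mean baseline intake; which group then receives vehicle or drug would be decided separately.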

2.9 General locomotor activity assessment

Square locomotor boxes from Med Associates (L 27.3 cm × W 27.3 cm × H 20.3 cm, St. Albans, VT, USA) were used to monitor locomotor activity. For all locomotor studies, animals were moved to the testing room for 60 min prior to testing for habituation. A 90‐min baseline habituation session to the boxes was conducted prior to drug administration to reduce novelty locomotor differences. The following day, mice were again habituated to the room for 60 min. Then mice were injected with drug or vehicle, and locomotor activity was monitored immediately for a total of 90 min. All testing was conducted during the dark/active phase.

2.10 Acute thermal antinociception

To measure antinociception, we utilized a tail‐flick assay as previously described (van Rijn, Brissett, & Whistler, 2012b). In short, 8‐week‐old male C57BL/6 wild‐type mice (n = 10) were habituated to handling. The following day, a baseline tail‐flick response was recorded using a radiant‐heat tail‐flick apparatus (Columbus Instruments, Columbus, OH, USA). The light intensity was set to “9” to produce an average baseline response of 2–3 s. We utilized a maximal cut‐off of 3× baseline to reduce the risk of damaging the tails of the mice. For each test, two tail‐flick responses were recorded, and the average was used for further analysis. Immediately after the baseline recording, mice were injected s.c. with saline and 30 min later, a new tail‐flick response was measured. After the saline test, mice were injected with 1 mg·kg−1 of TRV130 (s.c.), and a third tail‐flick response was recorded 30 min post‐TRV130 injection. The experiment was repeated in the same cohort of mice on the following 2 days but using 3 and 10 mg·kg−1 of TRV130 instead. We only tested eight mice with 10 mg·kg−1 of TRV130 and injected the remaining two mice with 10 mg·kg−1 of morphine as internal control (both mice displayed 100% antinociception, data not shown). Antinociception was calculated as maximal possible effect: %MPE = (Response_drug − Response_baseline)/(Response_cut‐off − Response_baseline) × 100.
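The %MPE formula can be written as a small helper. In this sketch the cut‐off is taken as 3× the baseline latency, as stated in the text; applying the multiplier to each mouse's own baseline and clamping latencies at the cut‐off are assumptions, as are the function name and the example latencies.

```python
def percent_mpe(response_drug_s: float, response_baseline_s: float,
                cutoff_multiplier: float = 3.0) -> float:
    """Maximal possible effect (%) for one tail-flick measurement.

    response_drug_s:     mean latency after drug (s)
    response_baseline_s: mean baseline latency (s)
    cutoff_multiplier:   cut-off as a multiple of baseline (3x in the text)
    """
    cutoff_s = cutoff_multiplier * response_baseline_s
    # A trial ends at the cut-off, so latencies cannot exceed it (assumption)
    latency_s = min(response_drug_s, cutoff_s)
    return (latency_s - response_baseline_s) / (cutoff_s - response_baseline_s) * 100.0


# Hypothetical mouse: 2.5 s baseline, 5.0 s after drug
print(f"{percent_mpe(5.0, 2.5):.0f}% MPE")
```

With a 2.5 s baseline the cut‐off is 7.5 s, so a 5.0 s latency sits halfway between baseline and cut‐off.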

2.11 “Brief” conditioned place preference

Mice were conditioned to drugs or vehicle as described previously (Varadi, Marrone, et al., 2015) with two modifications: (a) conditioning sessions lasted 40 min rather than 30 min and (b) a two‐chamber apparatus rather than a three‐chamber set‐up was utilized. One chamber contained a wire mesh floor and horizontal black/white striped wallpaper, whereas the second chamber contained a metal rod floor and vertical black/white striped wallpaper. To determine initial compartment bias, a vehicle injection was administered i.p. immediately prior to the pre‐conditioning session to create an unbiased, counterbalanced approach for drug pairing (half of the animals received drug on the pre‐test preferred side, while half received drug on the pre‐test non‐preferred side). Animals exhibiting >70% preference for one of the two chambers were removed from further testing. Over the following 2 days, two conditioning sessions per day were performed 4 hr apart (morning and afternoon; vehicle or drug in semi‐random order) for a total of four conditioning sessions (two to vehicle on the non‐drug‐paired side and two to drug on the drug‐paired side). On the post‐conditioning testing day, a vehicle injection was administered directly before placing the animals in the testing apparatus to determine post‐conditioning preference. For all sessions, animals were habituated to the testing room 60 min before sessions, and all behaviour was conducted during the dark/active phase.

2.12 “Extended” conditioned place preference

The differences between the “brief” and “extended” conditioned place preference (CPP) were as follows: (a) conditioning sessions were 30 min instead of 40 min, (b) mice only received one conditioning session per day (none on the weekend) and (c) mice received four rather than two vehicle and drug exposures. In the initial habituation session, a vehicle i.p. injection was administered prior to the session to assess initial bias towards either chamber. An unbiased, counterbalanced approach was then used to assign the drug‐paired side for each animal (half of the animals received drug on the pre‐test preferred side, while half received drug on the pre‐test non‐preferred side). Animals exhibiting >70% preference for one of the two chambers were removed from further testing. Over the course of 2 weeks, mice were conditioned on 8 days, but with only one conditioning session per day (four to vehicle on the non‐drug‐paired side and four to drug on the drug‐paired side). On the post‐conditioning testing day, a vehicle injection was administered directly before placing the animals in the testing apparatus where animals were allowed to explore both chambers to determine post‐conditioning preference. For all sessions, animals were habituated to the testing room 60 min before sessions, and all behaviour was conducted during the dark/active phase.
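Both CPP variants apply the same pre‐test exclusion rule (>70% preference for one chamber). A minimal sketch of that check is given below; the function names and the example times are hypothetical, and time is expressed in seconds only for illustration.

```python
def chamber_preference_pct(time_a_s: float, time_b_s: float) -> float:
    """Percentage of pre-test time spent in the preferred chamber."""
    total_s = time_a_s + time_b_s
    return max(time_a_s, time_b_s) / total_s * 100.0


def passes_pretest(time_a_s: float, time_b_s: float,
                   threshold_pct: float = 70.0) -> bool:
    """True if the animal stays in the study, i.e. its preference
    for either chamber does not exceed the threshold."""
    return chamber_preference_pct(time_a_s, time_b_s) <= threshold_pct


# Hypothetical pre-test times (s) in chamber A and chamber B
print(passes_pretest(1500, 900))   # 62.5% preference, retained
print(passes_pretest(1800, 600))   # 75% preference, excluded
```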

2.13 Data and statistical analysis

The data and statistical analysis comply with the recommendations of the British Journal of Pharmacology on experimental design and analysis in pharmacology (Curtis et al., 2018). All data are presented as means ± SEM, and analysis was performed using GraphPad Prism 8 software. For in vitro assays, non‐linear regression was conducted to determine pIC50 (cAMP) or pEC50 (β‐arrestin 2 recruitment). Technical replicates were used to ensure the reliability of single values: each data point for binding and arrestin recruitment was run in duplicate, and each data point for the cAMP assay in triplicate. The averages of each independent run were considered a single experiment and combined to provide a composite curve rather than a single “representative” curve. In each experimental run, a positive control/standard was utilized to allow the data to be normalized, thereby providing the opportunity to calculate the log bias value, which relies on the presence of the standard. For the data analysis of the behavioural experiments, we first established that the data set did not contain an outlier using the Grubbs' test. If the test revealed an outlier, this value was removed, but removal was limited to one data point per set. We then verified whether the data values came from a Gaussian distribution using the D'Agostino and Pearson omnibus normality test. If the data followed a Gaussian distribution, we carried out a parametric test; otherwise, we opted for the non‐parametric test. To determine statistical differences in the means between two values, we performed a Student's unpaired t‐test if the two datasets passed the normality test, but with a Welch's correction if the datasets did not have the same SD. For those datasets where one or both datasets did not pass the normality test, we performed a Mann–Whitney U test.
Significant changes in average alcohol intake were determined by one‐way, repeated measures ANOVA with Tukey's multiple comparisons (MC) test. For CPP, two‐way, repeated measures ANOVA with Bonferroni MC was used to determine significant differences in time spent on the drug‐paired side pre‐ versus post‐conditioning. One‐way ANOVA with Bonferroni MC determined significance for locomotor studies. For the repeated measures tests, whenever we could not assume sphericity, a Geisser–Greenhouse correction was carried out by GraphPad Prism 8 software. Post hoc tests were conducted only if F in ANOVA achieved P < .05 and there was no significant variance inhomogeneity. In this study, P values <.05 were considered as statistically significant (Tables S3–S6). Whenever possible, the experimenter was blind to the drug and/or dose tested; however, we always started with the lowest drug dose if multiple doses were to be tested. Animals were assigned to groups such that the baseline responding was equal across groups. The treatment that each group received was then randomized. Group sizes were equal by design and based on a power analysis calculated using the observed deviation in our prior published work. On occasion, mice were excluded prior to drug treatment because of a failure to consume alcohol during the initial alcohol voluntary consumption phase; however, on some occasions, we started with a larger group size to account for potential non‐responders. We did not explicitly design our experiments to test for sex differences; with the exception of the study of kratom impact on binge alcohol use, male and female groups were of unequal size. To account for the unequal sample size, we utilized the Sidak's post hoc test.
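The choice of two‐group test described in this section follows a short decision rule, which can be summarised as below. This is only a sketch of the logic: the normality and equal‐SD outcomes are assumed to have been determined beforehand (e.g. in GraphPad Prism), and the function name and returned labels are hypothetical.

```python
def choose_two_group_test(normal_a: bool, normal_b: bool,
                          equal_sd: bool) -> str:
    """Select the two-group comparison described in the text.

    normal_a, normal_b: whether each dataset passed the normality test
    equal_sd:           whether the two datasets have the same SD
    """
    if normal_a and normal_b:
        if equal_sd:
            return "unpaired Student's t-test"
        return "unpaired t-test with Welch's correction"
    # One or both datasets failed the normality test
    return "Mann-Whitney U test"


print(choose_two_group_test(True, True, True))
print(choose_two_group_test(True, True, False))
print(choose_two_group_test(True, False, True))
```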