Abstract Psychostimulant addiction is a heritable substance use disorder; however its genetic basis is almost entirely unknown. Quantitative trait locus (QTL) mapping in mice offers a complementary approach to human genome-wide association studies and can facilitate environment control, statistical power, novel gene discovery, and neurobiological mechanisms. We used interval-specific congenic mouse lines carrying various segments of chromosome 11 from the DBA/2J strain on an isogenic C57BL/6J background to positionally clone a 206 kb QTL (50,185,512–50,391,845 bp) that was causally associated with a reduction in the locomotor stimulant response to methamphetamine (2 mg/kg, i.p.; DBA/2J < C57BL/6J)—a non-contingent, drug-induced behavior that is associated with stimulation of the dopaminergic reward circuitry. This chromosomal region contained only two protein coding genes—heterogeneous nuclear ribonucleoprotein, H1 (Hnrnph1) and RUN and FYVE domain-containing 1 (Rufy1). Transcriptome analysis via mRNA sequencing in the striatum implicated a neurobiological mechanism involving a reduction in mesolimbic innervation and striatal neurotransmission. For instance, Nr4a2 (nuclear receptor subfamily 4, group A, member 2), a transcription factor crucial for midbrain dopaminergic neuron development, exhibited a 2.1-fold decrease in expression (DBA/2J < C57BL/6J; p 4.2 x 10−15). Transcription activator-like effector nucleases (TALENs)-mediated introduction of frameshift deletions in the first coding exon of Hnrnph1, but not Rufy1, recapitulated the reduced methamphetamine behavioral response, thus identifying Hnrnph1 as a quantitative trait gene for methamphetamine sensitivity. These results define a novel contribution of Hnrnph1 to neurobehavioral dysfunction associated with dopaminergic neurotransmission. These findings could have implications for understanding the genetic basis of methamphetamine addiction in humans and the development of novel therapeutics for prevention and treatment of substance abuse and possibly other psychiatric disorders.

Author Summary Both genetic and environmental factors can powerfully modulate susceptibility to substance use disorders. Quantitative trait locus (QTL) mapping is an unbiased discovery-based approach that is used to identify novel genetic factors and provide new mechanistic insight into phenotypic variation associated with disease. In this study, we focused on the genetic basis of variation in sensitivity to the acute locomotor stimulant response to methamphetamine which is a behavioral phenotype in rodents that is associated with stimulated dopamine release and activation of the brain reward circuitry involved in addiction. Using brute force monitoring of recombination events associated with changes in behavior, we fortuitously narrowed the genotype-phenotype association down to just two genes that we subsequently targeted using a contemporary genome editing approach. The gene that we validated–Hnrnph1 –is an RNA binding protein that did not have any previously known function in psychostimulant behavior or psychostimulant addiction. Our behavioral data combined with our gene expression results provide a compelling rationale for a new line of investigation regarding Hnrnph1 and its role in neural development and plasticity associated with the addictions and perhaps other dopamine-dependent psychiatric disorders.

Citation: Yazdani N, Parker CC, Shen Y, Reed ER, Guido MA, Kole LA, et al. (2015) Hnrnph1 Is A Quantitative Trait Gene for Methamphetamine Sensitivity. PLoS Genet 11(12): e1005713. https://doi.org/10.1371/journal.pgen.1005713 Editor: Gregory P. Copenhaver, The University of North Carolina at Chapel Hill, UNITED STATES Received: August 5, 2015; Accepted: November 9, 2015; Published: December 10, 2015 Copyright: © 2015 Yazdani et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited Data Availability: The F2 data and R code are publicly available on github (https://github.com/wevanjohnson/hnrnph1). The transcriptome dataset, and code for RNA-seq analysis are available via NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=cxkdoeaudvyhlqt&acc=GSE66366). Funding: This study was funded by the National Institutes of Health / National Institute on Drug Abuse grants R00DA029635 (CDB), R01DA039168 (CDB), and R01DA021336 (AAP) as well as the National Institutes of Health / National Human Genome Research Institute grant R01HG005692 (WEJ) and the National Institutes of Health / National Institute of General Medical Sciences T32GM008541 (NY). Funding was also provided by the Burroughs Wellcome Fund Transformative Training Program in Addiction Science #1011479 (NY). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: The authors have declared that no competing interests exist.

Introduction Substance use disorders (SUDs) involving psychostimulants such as cocaine and methamphetamine (MA) are heritable; however, their major genetic determinants remain poorly defined [1–4]. In particular, genome-wide association studies (GWAS) of psychostimulant abuse have yet to discover the underlying genetic factors or causal sequence variants. SUDs involve multiple discrete steps including initial use, escalation, withdrawal, and relapse, each of which is believed to have a distinct genetic architecture. Therefore, we and others have used model organisms to explore the genetic basis of intermediate phenotypes, including initial drug sensitivity [5]. Model systems have great potential for studying addiction-relevant intermediate phenotypes [6] because they provide exquisite control over environmental conditions, including exposure to psychostimulants. Psychostimulants activate the mesocorticolimbic reward circuitry in humans [7] and stimulate locomotor activity in mice [8]. The primary molecular targets of psychostimulants are the membrane-spanning monoaminergic transporters. Amphetamines act as substrates and cause reverse transport and synaptic efflux of dopamine, norepinephrine, and serotonin [9–11]. Sensitivity to the locomotor stimulant response to MA is heritable and may share a genetic basis with the addictive, neurotoxic, and therapeutic properties of amphetamines [8, 12–15]. More broadly, determining the genetic basis of sensitivity to amphetamines may provide insight into the neurobiology of other conditions involving perturbations in dopaminergic signaling, including attention deficit hyperactive disorder (ADHD), schizophrenia, and Parkinson’s disease [16]. This hypothesis is supported by our recent identification of a genetic correlation between alleles that increased amphetamine-induced euphoria and alleles that decreased risk of schizophrenia and ADHD [17]. We and others have reported several quantitative trait loci (QTLs) in mice that influence MA sensitivity [12, 18–24]. A distinct advantage of QTL analysis is that chromosomal regions can eventually be mapped to their causal polymorphisms. However, obtaining gene-level and nucleotide-level resolution can be extremely challenging when beginning with a lowly recombinant population such as an F 2 cross. A classical approach is to fine map QTLs derived from an F 2 cross using successively smaller congenic strains. Whereas this approach is efficient for Mendelian alleles, there are only a few examples in which this approach has been successful in identifying alleles for more complex, polygenic traits, such as histocompatibility [25], substance abuse [26] and depressive-like behavior [27]. In the present study, we fine mapped a QTL on chromosome 11 that modulates methamphetamine sensitivity and that segregates between C57BL/6J (B6) and DBA/2J (D2) inbred strains [12, 20]. We used interval-specific congenic lines in which successively smaller D2-derived segments were introgressed onto a B6 background [28]. We also conducted transcriptome analysis of brain tissue from a congenic line that captured the QTL for reduced MA sensitivity. Our transcriptome analysis focused on the striatum, which is a brain region important for psychostimulant-induced locomotor activity and reward [29]. We used GeneNetwork [30] and in silico expression QTL (eQTL) analysis of several brain regions to identify cis- and trans-eQTLs that may explain changes in the transcriptome caused by this QTL. Finally, to identify the quantitative trait gene responsible for reduced MA sensitivity, we used transcription activator-like effector nucleases (TALENs) to introduce frameshift deletions in the first coding exon of each positional candidate gene [31].

Materials and Methods Mice All procedures in mice were approved by the Boston University and the University of Chicago Institutional Animal Care and Use Committees and were conducted in strict accordance with National Institute of Health guidelines for the care and use of laboratory animals. Colony rooms were maintained on a 12:12 h light–dark cycle (lights on at 0600 h). Mice were housed in same-sex groups of two to five mice per cage with standard laboratory chow and water available ad libitum. Age-matched mice were 50–100 days old at the time of testing (0900–1600 h). Locomotor activity For Lines 1–6 and Lines 4a-4h, locomotor activity was assessed in the open field [19]. Briefly, congenics, subcongenics, and wild-type littermates were transported from the vivarium to the adjacent behavioral testing room where they habituated for at least 30 min prior to testing. Mice were then placed into clean holding cages with fresh bedding for approximately five min before receiving an injection of saline on Days 1 and 2 (10 μl/g, i.p) and an injection of methamphetamine on Day 3 (MA; 2 mg/kg, i.p.; Sigma-Aldrich, St. Louis, MO USA). Mice were placed into the center of the open field (37.5 cm x 37.5 cm x 35.7 cm; AccuScan Instruments, Columbus, OH USA) surrounded by a sound attenuating chamber (MedAssociates, St. Albans, VT USA) and the total distance traveled was recorded in six, 5 min bins over 30 min using VersaMax software (AccuScan). Mice heterozygous for a frameshift deletion in Hnrnph1 (Hnrnph1 +/-) or Rufy1 (Rufy1 +/-) were engineered (http://www.bumc.bu.edu/transgenic/), bred, and phenotyped at Boston University School of Medicine. Mice were bred and phenotyped in a manner similar to the congenics at the University of Chicago, with the exception that the open field was a smaller size (43.2 cm long x 21.6 cm wide x 43.2 cm tall; Lafayette Instruments, Lafayette, IN USA) and mice were recorded daily for 1 h rather than 30 min to allow a more robust detection of the phenotype. Reduced MA sensitivity was also replicated in Hnrnph1 +/- mice using the 30 min protocol (Supplementary Information). Behavior was videotaped using a security camera system (Swann Communications, Melbourne, Australia) and data were collected and analyzed using video tracking (Anymaze, Stoelting, Wood Dale, IL USA). Behavioral analysis Because our primary focus was on MA-induced locomotor activity on Day 3, we first ran a two-way repeated measures ANOVA for Day 3 using genotype and sex as factors and time as the repeated measure. Because sex did not interact with genotype or time for any of the lines on Day 3, we combined sexes for the analysis of Days 1–3 and used repeated measures ANOVA with genotype as the main factor. Main effects of genotype and genotype x time interactions were deconstructed using one-way ANOVAs and Fisher’s post-hoc test of each time bin or t-tests in cases where there were two genotypes. A p-value of less than 0.05 was considered significant. QTL analysis of F 2 mice B6 x D2-F 2 mice (N = 676) were generated, maintained, genotyped, and analyzed as previously described [20, 22]. Genome-wide QTL analysis was performed in F 2 mice using the R package QTLRel that contains a mixed model to account for relatedness among individuals [61]. We recently validated the use of permutation when estimating significance thresholds for mixed models [62]. Sex was included as an interactive covariate. For each analysis, significance thresholds (p < 0.05) were estimated using 1000 permutations. The F 2 data and R code for are publicly available on github (https:/github.com/wevanjohnson/hnrnph1). Generation of congenics and subcongenics Lines 1 and 6 were obtained from Dr. Aldons Lusis’s laboratory at UCLA (Lines “11P” and “11M” [28]) and had previously been backcrossed to B6 for more than 10 generations. These lines contained homozygous, introgressed regions from D2 on an isogenic B6 background that spanned chromosome 11. Because Lines 1 and 6 contained such large congenic intervals, we first phenotyped non-littermate offspring derived from homozygous congenic breeders versus homozygous B6 wild-type breeders (The Jackson Laboratory, Bar Harbor, ME; Figs 2 and S2) rather than heterozygous-heterozygous breeders to avoid the otherwise high likelihood of introducing unmonitored recombination events. Thus, we ensured that each individual possessed an identical genotype within each congenic line. The same type of control group is typically employed in the initial screen of chromosome substitution strains [19, 63, 64] which are essentially very large congenic lines. We crossed Line 1 to B6 and phenotyped the F 1 offspring alongside age-matched B6 mice. B6 cohorts were combined into a single group for the combined analysis of all three genotypes for Line 1 (homozygous for B6, homozygous for D2, and heterozygous; Fig 2). Next, we backcrossed Line 1 heterozygotes to B6 to generate subcongenic Lines 2–5 (Figs 2 and S2). Recombination events were monitored using genomic DNA extracted from tail biopsies and a series of TaqMan SNP markers (Life Technologies; Carlsbad, CA; S1 Table). We then used heterozygous-heterozygous breeding in Lines 2–5 to produce littermates of all three genotypes for simultaneous phenotyping (Figs 2 and S2). Because the QTL in Line 4 represented the smallest congenic region and was dominantly inherited, we backcrossed Line 4 heterozygotes to B6 to generate heterozygotes and wild-type littermates for Lines 4a-4h (Figs 3 and S3). We used additional TaqManSNP markers (Life Technologies) to monitor recombination events and defined the precise congenic boundaries using PCR and Sanger sequencing of SNPs chosen from the Mouse Sanger SNP query database (http://www.sanger.ac.uk/cgi-bin/modelorgs/mousegenomes/snps.pl [52]). Genomic coordinates are based on mm9 (Build 37). Test for residual heterozygosity in Lines 4a, 4b, 4c, and 4d We assayed tail SNP DNA from one heterozygous congenic mouse and one B6 wildtype littermate from Lines 4a-4d (eight mice total) using services provided by the DartMouseSpeed Congenic Core Facility at the Geisel School of Medicine at Dartmouth College (http://dartmouse.org/). A total of 882 informative B6/D2 SNPs were analyzed on the GoldenGate Genotyping Assay (Illumina, Inc., San Diego, CA) using DartMouse’s SNaP-Mapand Map-Synth software to determine the allele at each SNP location. After detecting a single off-target locus on chromosome 3 (rs13477019; 23.7 Mb), we used a custom designed TaqMan SNP marker for rs13477019 (Life Technologies, Carlsbad, CA USA) to confirm the result and to genotype additional samples from Lines 4a-4h for which we had both DNA and behavioral phenotypes. Data from this SNP marker were then used to test for the effect of genotype at the chromosome 3 locus on MA-induced locomotor activity. RNA-seq We harvested and pooled bilateral 2.5 mm diameter punches of the striatum for each individual sample from naïve, congenic mice and B6 wildtype littermates from Line 4a (N = 3 females and 5 males per genotype; 50–70 days old). Total RNA was extracted as previously described [23] and purified using the RNeasy kit (Qiagen, Valencia, CA, USA). RNA was shipped to the University of Chicago Genomics Core Facility where cDNA libraries were prepared for 50 bp single-end reads according to the manufacturer’s instructions using the Illumina TruSeqStranded mRNA LT Kit (Part# RS-122-2101). Purified DNA was captured on an Illumina flow cell for cluster generation and sample libraries were sequenced at eight samples per lane over two lanes (technical replicates) on the Illumina HiSeq 2500 machine according to the manufacturer’s protocols. FASTQ files were quality checked via FASTQC and possessed Phred quality scores > 30 (i.e. less than 0.1% sequencing error). Using the FastX-Trimmer from the FastX-Toolkit, the 51st base was trimmed to enhance read quality and prevent misalignment. FASTQ files were utilized in TopHat [65] to align reads to the reference genome (UCSC Genome Browser). Read counts per gene were quantified using the HTSeq Python package and the R Bioconductor package edgeR was used to analyze differential gene expression. EdgeR models read counts using a negative binomial distribution to account for variability in the number of reads via generalized linear models [66]. “Home cage” was included as a covariate in the statistical model to account for cage effects on gene expression. The p-values obtained for differential expression were then adjusted by applying a false discovery rate (FDR) method to correct for multiple hypothesis testing [67]. The transcriptome dataset and code for RNA-seq analysis are available via NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=cxkdoeaudvyhlqt&acc=GSE66366). Real-time quantitative PCR (qPCR) Oligo-dT primers were used to synthesize cDNA from total RNA to examine mRNA expression. Primer efficiencies for real-time quantitative PCR (qPCR) experiments were calculated using cycle threshold (C T ) values (SYBR® Green; Life Technologies) derived from five, 10-fold serial cDNA dilutions; efficiencies (E) ranged from 90–100% (R2 = 0.99–1). Each sample was run in triplicate and averaged. Differential gene expression was reported as the fold-change in congenic or frameshift-deleted mice relative to B6 wild-type littermates using the 2-(∆∆C T ) method [68]. Ingenuity pathway analysis (IPA) We used our differentially expressed gene list from the striatal transcriptome that contained both the log 2 fold-change and p-values (FDR < 5%) and applied IPA (www.qiagen.com/ingenuity) to identify enriched molecular pathways, functional annotations, gene networks, upstream causes, and predicted neurobiological consequences caused by inheritance of the QTL. IPA utilizes an algorithm that assumes that an increase in the number of molecular interactions indicates an increase in the likelihood of an effect on biological function. IPA uses a manually curated database (IPA Knowledge Base) containing the published literature to extract gene networks containing equally treated edges that directly and indirectly connect biologically related genes (www.qiagen.com/ingenuity). IPA analyses were conducted in February 2015. IPA settings. We considered both direct and indirect relationships that were experimentally observed or moderately-to-highly predicted in all mammalian species, including mouse and rat. We used the “stringent” setting to filter molecules and relationships in tissues and cell lines. With regard to mutations, we considered all functional effects, modes of inheritance, translational impacts, zygosity, wild-type, and unclassified mutation information. Canonical pathways. The ratio of the canonical pathways represents the number of genes in our gene list that overlap with the genes listed in the IPA-generated pathway divided by the total genes within the IPA-generated pathway; thus, a ratio equal to 1 represents perfect overlap. The–log 10 (p-value) for each canonical pathway was derived from the right-tailed Fisher’s exact that measured the degree of overlap between the number of genes identified in our list with the number of genes that comprise the canonical pathway versus the number of genes genome-wide that would be expected to overlap by chance. The p-values were corrected for multiple testing using the Benjamini-Hochberg method [67] and represent the FDR. Diseases, functions, and gene networks. The statistical significance of overlap between our gene list and a particular disease or function was assessed using the p-value derived from a Fisher’s exact test. The predicted activation state was assessed by calculating a z-score that determined the statistical significance of the match between the observed and predicted direction. “Increased” or “decreased” indicates that the Z-score was significant for predicting activated or inhibited state. IPA networks were built based on the degree of connectivity between genes within our gene list, starting with the most connected genes. Genes were added by the IPA algorithm to the network to facilitate connectivity. Networks were limited to a maximum of 35 genes to facilitate interpretability and the ability to generate hypotheses. The Network Score (see S10 Table), a.k.a., the “p score”, represents the–log 10 (p-value) and represents the probability of finding the observed number of focus genes in a network by chance. Upstream regulator analysis. This analysis identifies causal molecules associated with differential expression using both the significance and the direction of differential expression to specify causal predictions. Several plausible causal networks are constructed and used to calculate an enrichment score and p-value based on overlap between predicted and observed regulator-regulated genes (Fishers exact test). A Z-score is also calculated that determines the degree of match between observed and predicted direction of gene expression (+ or—[69]). “Increased” or “decreased” indicates that the Z-score was significant for predicting activation or inhibition of the regulator. GeneNetwork To identify published cis- and trans- eQTLs that could explain gene expression differences caused by inheritance of the Line 4a congenic interval, we queried differentially expressed genes (FDR < 20%; 174 genes total; S6 Table) in transcriptome datasets from several brain regions in GeneNetwork [30] involving BXD recombinant inbred strains (recombinant inbred strains derived from B6 and D2 strains). We considered cis- and trans-QTLs originating from SNPs located within the 50–60 Mb locus and employed an arbitrary cut-off of LRS ≥ 13.8 (LOD ≥ 3). We only included genes where there was an exact match of gene with the LRS location using the appropriate genome build coordinates for each dataset. Generation of TALENs-targeted Hnrnph1 +/- and Rufy1 +/- mice TALENs vectors encoded either the right or left arm of the TALE effector that targeted the first coding exons of Hnrnph1 or Rufy1 (Cellectis Bioresearch, Paris, France). Upon bacterial cloning and purification, TALENs vectors containing a T7 promoter were linearized and used as templates for in vitro mRNA synthesis (mMessage mMachine T7 transcription kit; Life Technologies), and purified using MEGAclear transcription clean-up kit (Life Technologies). Each mRNA cocktail was diluted in sterile buffer and injected into B6 single-cell embryos at the BUMC Transgenic Core facility (http://www.bumc.bu.edu/transgenic/). We developed a genotyping assay utilizing native restriction enzyme recognition sites within the TALENs FokI cleavage domain. Genomic DNA was extracted from mouse tail biopsies and PCR-amplified with primers targeting100 base pairs upstream and downstream of the TALENs binding domain. Amplicons were then exposed to restriction digest overnight, run on a 2% agarose Ethidium Bromide Tris-Borate-EDTA gel, and imaged with ultraviolet light. TALENs-targeted deletions were identified by the presence of undigested bands caused by a loss of the restriction site. To confirm base pair deletions in our founder lines, undigested restriction enzyme-exposed PCR amplicon bands were excised, gel-purified, and vector-ligated overnight at 4°C using the pGEM T-easy Vector Systems (Promega). The ligation reaction was transformed into MAX Efficiency DH5α Competent Cells (Invitrogen) and plated onto Ampicillin-IPTG/X-Gal LB agarose plates for blue-white selection. Following overnight incubation at 37°C, white colonies were picked, cultured in ampicillin-enriched LB medium, and amplified. The PCR product was purified using the QIAprep Miniprep kit (QIAGEN). We then sequenced the vectors for the deletions using the pGEM T7 site upstream of the insert. Genotyping of TALENs-targeted Hnrnph1 +/- and Rufy1 +/- mice An Hnrnph1 forward primer (GTTTTCTCAGACGCGTTCCT) and reverse primer (ACTGACAACTCCCGCCTCA) were designed to target upstream and downstream of the TALENs binding domain in exon 4 of Hnrnph1. Genomic DNA was used to amplify a 204 bp PCR product using DreamTaq Green PCR Mastermix (ThermoScientific). PCR products were treated with the BstNI restriction enzyme (New England Biolabs) or a control enzyme-free buffer solution and incubated overnight at 60°C to ensure complete digestion. Enzyme-treated PCR products and untreated controls were resolved in 2% agarose gel electrophoresis with 0.5 μg/mL ethidium bromide to visualize under UV light. There were two BstNI restriction sites within the Hnrnph1 amplicon that were located proximal and distal to the TALENs FokI cleavage zone. Mice heterozygous for the Hnrnph1 deletion showed two bands on the gel, while B6 controls showed a single band. Similar to Hnrnph1, a Rufy1 forward primer (AATCGTACTTTCCCGAATGC) and reverse primer (GGACTCTAGGCCTGCTTGG) targeted upstream and downstream of the TALENs binding domain in the first coding exon (exon 1). The 230 bp PCR amplicon contained a SacII restriction site that was deleted in Rufy1 +/- mice. Thus, Rufy1 +/+ mice showed a single, smaller digested band whereas Rufy1 +/- mice showed both the digested band as well as a larger, undigested band. Assessment of potential off-target deletion of Hnrnph2 in Hnrnph1-targeted mice To assess off-target activity in Hnrnph1-targeted mice, we used the UCSC genome browser to BLAT the TALENs binding domains and identified a single homologous region located within the first coding exon of Hnrnph2. We used the same PCR- and gel-based assay to test for the deletion in Hnrnph2 with the exception that we used forward (GCCACCAAGAGTCCATCAGT) and reverse primers (AATGCTTCACCACTCGGTCT) that uniquely amplified a homologous 197 bp sequence within Hnrnph2 that contained a single Bstn1 restriction site. Digestion at the Bstn1 site produced an 81 bp band and a 115 bp band.

Acknowledgments We would like to acknowledge Dr. Aldons J. Lusis for providing us with Lines 1 and 6 as well as Dr. David R. Beier (U01HD43430) and Dr. Jennifer Moran for conformational genotyping of these lines. We thank Dr. Katya Ravid, Dr. Kenneth Albrecht, and Greg Martin of the Boston University School of Medicine Transgenic and Genomic Engineering Core Facility (http://www.bumc.bu.edu/transgenic/) who aided in generating TALENs-targeted mice.

Author Contributions Conceived and designed the experiments: CDB AAP. Performed the experiments: NY CCP MAG LAK SLK JEL. Analyzed the data: NY CCP YS ERR GS RC WEJ. Contributed reagents/materials/analysis tools: RC WEJ. Wrote the paper: CDB NY AAP JEL. Suggested additional analysis used in the supplementary information: JEL.