Organisms are remarkably adapted to diverse environments through specialized metabolisms, morphology, or behaviors. To address the molecular mechanisms underlying environmental adaptation, we have utilized a Drosophila melanogaster line, termed “Dark-fly”, which has been maintained in constant dark conditions for 57 years (1400 generations). We found that Dark-fly exhibited higher fecundity in dark than in light conditions, indicating that Dark-fly possesses some traits advantageous in darkness. Using next-generation sequencing technology, we determined the whole genome sequence of Dark-fly and identified approximately 220,000 single nucleotide polymorphisms (SNPs) and 4,700 insertions or deletions (InDels) in the Dark-fly genome compared to the genome of the Oregon-R-S strain, a control strain. 1.8% of SNPs were classified as non-synonymous SNPs (nsSNPs: i.e., they alter the amino acid sequence of gene products). Among them, we detected 28 nonsense mutations (i.e., they produce a stop codon in the protein sequence) in the Dark-fly genome. These included genes encoding an olfactory receptor and a light receptor. We also searched for runs of homozygosity (ROH) regions as putative regions selected during the population history, and found 21 ROH regions in the Dark-fly genome. We identified 241 genes carrying nsSNPs or InDels in the ROH regions. These include a cluster of alpha-esterase genes that are involved in detoxification processes. Furthermore, analysis of structural variants in the Dark-fly genome showed the deletion of a gene related to fatty acid metabolism. Our results revealed unique features of the Dark-fly genome and provided a list of potential candidate genes involved in environmental adaptation.

Here, we found that Dark-fly produced more offspring in dark than in light conditions, suggesting that Dark-fly possesses some traits advantageous in darkness. To examine genomic alterations involved in environmental adaptation, we performed whole genome sequencing for Dark-fly using NGS technology and found unique features of its genome.

In 1954, a fly population derived from one pair of Oregon-R-S flies was divided into 6 populations. Three of them (aL, bL and cL populations) were reared in normal light-dark cycling conditions and the remaining three populations (dD, eD, and fD populations) were reared in constant dark conditions. Unfortunately, all of the L lines were lost by 2002. The dD and eD lines were lost in 1965 and 1967, and only the fD line has been maintained until now. In 2008, we started to rear the fD line and designated it “Dark-fly”. We have maintained Dark-fly in a minimal medium as was done previously (black lines), and in a standard cornmeal medium (white lines) in parallel. The population size of Dark-fly has not been controlled but has usually been about 100 flies each in several culture vials.

We utilized NGS technology to study an unusual line of Drosophila. On November 11, 1954, the late Dr. Syuichi Mori (Kyoto University) began an experiment to maintain a Drosophila melanogaster strain, Oregon-R-S, in constant dark conditions (Fig. 1) [13]. Through 2012, this fly line, designated Dark-fly Oregon-R-S (hereafter referred to simply as “Dark-fly”), has been reared in darkness for 57 years (1400 generations). Previous studies revealed that Dark-fly showed strong phototactic ability compared to the control sister lines that had been maintained in normal light conditions [14], [15]. It is known that flies reared in the dark become sensitive to light via physiological changes [16]. Interestingly, the phototactic ability of Dark-fly remains high even after rearing in the light for 100 generations [17], indicating that Dark-fly seems to have lost the physiological plasticity of this trait, presumably due to genomic alterations. It was also shown that the head bristles of Dark-fly are longer than those of the wild-type strain [18] and that Dark-fly maintains circadian rhythms as well as the control line does [19]. Since Dark-fly possesses eyes and pigmented cuticles and does not show apparent morphological traits related to dark adaptation, it is unclear whether Dark-fly is really adapted for living in the dark. Unfortunately, the control sister lines were lost during the rearing history, and only one of the three replicate lines reared in the dark (fD line) has survived until now (Fig. 1). Therefore, it is impossible to compare Dark-fly directly with the control sisters. Nevertheless, Dark-fly is a unique organism reared long-term in a dark environment, and accordingly can be utilized for analyzing traits and genes involved in environmental adaptation. Furthermore, Dark-fly has been reared with a minimal medium, called Pearl's medium [14]. There is a considerable possibility that poor nutrient conditions influence the selective pressure in dark environments. Thus, Dark-fly might be useful for analyzing interactive effects of environmental factors on selection, which probably occur in nature.

Experimental evolution studies utilize model organisms evolved in defined environments in the laboratory, and therefore they address environmental adaptation more directly. Indeed, previous experimental evolution studies observed genomic alterations under environmental selection and evaluated the effectiveness of multiple genes on fitness [5], [6], [7], [8]. Those molecular studies generally utilized unicellular organisms, such as bacteria and yeast, because of their short generation times and relatively small genomes. Experimental evolution studies using multi-cellular sexual organisms have generally been limited to analyses of trait evolution; for example, increased abdominal bristle number in Drosophila [9]. Recent progress in genome science, as represented by next-generation sequencing (NGS) technology, has changed the situation by enabling us to determine the whole genome sequences of organisms from enormous output data [10]. This technology has recently been applied in some experimental evolution studies. Burke et al. showed genome sweep in Drosophila populations selected for accelerated development [11] and Zhou et al. analyzed genome features of hypoxia-tolerant Drosophila populations [12]. NGS is now starting to be used to characterize the whole genome sequences of laboratory-evolved organisms.

Organisms display traits beautifully adaptive for their environments. How organisms come to possess adaptive traits is a fundamental question for evolutionary biology. It is accepted that genomic alterations lead to diverse traits, and adaptive traits are then selected during evolutionary history. To understand the mechanisms of environmental adaptation, it is necessary to link genome to trait. Previous studies have identified genomic alterations causing evolved traits [1], for example, skin albinism of cavefish [2], wing spot gain of a Drosophila species [3], and pelvic loss of freshwater sticklebacks [4]. Those studies took mainly two approaches: “candidate gene studies” examined the genes most likely involved in the trait, while “quantitative trait loci studies” characterized the whole genome but evaluated major effects of a few genes. As a next step toward understanding the molecular evolution of adaptive traits, we need to view the whole genome sequence of the evolved organisms and to evaluate the effects of multiple genes. However, it is difficult to estimate the selective pressure on genes in natural environments, because the environments in nature are so diverse that the selective pressure is modulated by multiple environmental factors in a complicated manner.

Results

Non-synonymous SNPs and coding InDels were concentrated in some gene families

Since Dark-fly displays some traits advantageous for living in the dark, it should carry some genomic alterations related to these traits. Even so, most of the SNPs we found would be expected to be functionally neutral and only a small fraction of the SNPs should contribute to the traits. To evaluate the Dark-fly SNPs, we categorized each SNP by its position relative to gene structures, such as intergenic regions and gene coding regions. Since one SNP often affects several isoforms of a gene or several overlapping genes simultaneously, the 415,626 SNPs of Dark-fly were classified into 1,435,028 SNP-effects (Table 2). It is not easy to evaluate SNPs in intergenic regions, and accordingly we focus on the coding SNPs hereafter. Of the SNP-effects, 6.7% were synonymous SNPs (sSNPs: i.e., they do not alter amino acid sequences of gene products), and 1.8% were non-synonymous SNPs (nsSNPs: i.e., they change the amino acid sequence) (Table 2). We collected the Dark-fly-specific nsSNPs without redundancy between isoforms and identified 4,323 genes carrying nsSNPs. We performed similar processes for the Oregon-R-S genome and identified 3,039 such genes. An InDel is an insertion or deletion of a few nucleotides and can be detected by analyzing the NGS data. We identified 5,322 and 5,461 InDels for Dark-fly and Oregon-R-S, respectively, and 662 of these InDels (12.4% for Dark-fly) were shared between them (Table 2). We classified each InDel by its position relative to gene structures, by a process similar to that performed for SNP analysis. InDels in gene coding regions (cInDels) would result in codon deletion, codon insertion, or frame-shift of gene products, so that the effects of cInDels would be severe, like those of nsSNPs. We identified 50 and 27 cInDels specifically found in Dark-fly and Oregon-R-S, respectively (Table 2).
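
The distinction between synonymous and non-synonymous SNP-effects, and the way a single SNP can produce several SNP-effects, can be illustrated with a minimal Python sketch. It assumes the standard genetic code and codons already extracted in the reading frame of each transcript. The function names and the transcript labels (`tx-A`, `tx-B`) are hypothetical and are not taken from the annotation pipeline used in this study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, pos, alt_base):
    """Label a single-base substitution at index `pos` (0-2) of `ref_codon`."""
    alt_codon = ref_codon[:pos] + alt_base + ref_codon[pos + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        # Nonsense changes are the non-synonymous changes that create a stop codon.
        return "nonsense"
    return "non-synonymous"


def snp_effects(codon_contexts):
    """Return one effect per transcript overlapping the SNP.

    `codon_contexts` maps a (hypothetical) transcript ID to the tuple
    (ref_codon, position_in_codon, alt_base) observed in that transcript.
    """
    return {
        tx: classify_codon_change(codon, pos, alt)
        for tx, (codon, pos, alt) in codon_contexts.items()
    }


# The same G-to-A substitution read in two frames by two hypothetical isoforms.
effects = snp_effects({"tx-A": ("CTG", 2, "A"), "tx-B": ("TGG", 1, "A")})
print(effects)
```

In this toy case one substitution yields two SNP-effects, synonymous in one isoform and nonsense in the other, which is why the number of SNP-effects exceeds the number of SNPs.
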
We then asked whether the nsSNP or cInDel-carrying genes are concentrated in any gene families in the Dark-fly genome. Using the web-based tool DAVID [22], we identified 20 Gene Ontology (GO) families (by molecular function category) that contained nsSNPs or cInDels at higher probability than the average for all genes throughout the genome (p-value<0.05, Table S1). Among them, 4 GO families, including families associated with metal ion binding (GO:0046872) and UDP-glycosyltransferase activity (GO:0008194), were shared between Dark-fly and Oregon-R-S (* in Tables S1 and S2), suggesting that these genes might have been commonly subject to mutations. The remaining 16 GO families were found specifically for Dark-fly (Table S1). These include families associated with carboxylesterase activity (GO:0004091) and guanyl-nucleotide exchange factor activity (GO:0005085). Thus, these gene families have accumulated nsSNPs and cInDels in the Dark-fly genome.
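
The idea behind such an enrichment analysis can be sketched with a plain one-sided hypergeometric test. This is a simplified stand-in and not necessarily the statistic that DAVID reports; the function name and all gene counts below are hypothetical and chosen only to exercise the code.

```python
from math import comb


def enrichment_p_value(total_genes, family_genes, hit_genes, family_hits):
    """One-sided hypergeometric P(X >= family_hits).

    total_genes  : annotated genes in the background
    family_genes : background genes belonging to the GO family
    hit_genes    : genes carrying an nsSNP or cInDel
    family_hits  : hit genes that belong to the GO family
    """
    upper = min(family_genes, hit_genes)
    favourable = sum(
        comb(family_genes, i) * comb(total_genes - family_genes, hit_genes - i)
        for i in range(family_hits, upper + 1)
    )
    return favourable / comb(total_genes, hit_genes)


# Hypothetical counts, not values from Table S1.
p = enrichment_p_value(total_genes=1000, family_genes=40, hit_genes=300, family_hits=20)
print(f"p = {p:.4g}, enriched at 0.05: {p < 0.05}")
```

A family is called enriched when the variant-carrying genes contain more of its members than expected from the background proportion. In practice a multiple-testing correction across GO families would also be considered.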

Nonsense mutations were identified in the Dark-fly genome

Among nsSNPs, a nonsense mutation produces a stop codon in the amino acid sequence of a gene product, and may severely affect the protein's function. We identified 28 nonsense mutations in the Dark-fly genome (Table S3). Among them, 10 mutations (for example, in the Hn and HisCl1 genes) were located in a subset of a gene's isoforms, so that the nonsense mutation might be complemented by redundant function(s) of other isoform(s). The remaining 18 mutations were located at sites shared by all of the gene's isoforms or at sites of a gene encoding a unique transcript, so that functional consequences of these mutations would be inevitable. These included genes encoding an olfactory receptor (Or65c) and a light receptor (Rh7). Indeed, the Dark-fly nonsense mutations were preferentially concentrated in one GO family associated with sensory perception (BP_5 category: GO:0007600, data not shown). We also detected a similar number of nonsense mutations (23 mutations) in the Oregon-R-S genome (Table S4), but those were not concentrated in any GO families.
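
The split between nonsense mutations that hit only a subset of isoforms and those that hit every isoform can be expressed as a simple set comparison. The following Python sketch assumes that, for each mutation, the affected isoforms and the full isoform set of the gene are already known; the function name, mutation IDs, and isoform names are hypothetical.

```python
def partition_nonsense(mutations):
    """Split nonsense mutations by isoform coverage.

    `mutations` maps a mutation ID to (affected_isoforms, all_isoforms_of_gene),
    both given as sets of isoform names.
    Returns (partial, complete): IDs affecting only some isoforms, and IDs
    affecting every isoform (including genes with a single transcript).
    """
    partial, complete = [], []
    for mut_id, (affected, all_isoforms) in mutations.items():
        if affected >= all_isoforms:  # superset test: every isoform is hit
            complete.append(mut_id)
        else:
            partial.append(mut_id)
    return partial, complete


# Hypothetical input: geneX has two isoforms, geneY has a unique transcript.
example = {
    "mut1": ({"geneX-RA"}, {"geneX-RA", "geneX-RB"}),
    "mut2": ({"geneY-RA"}, {"geneY-RA"}),
}
partial, complete = partition_nonsense(example)
print(partial, complete)
```

Mutations in the `complete` group correspond to the case described above in which no unaffected isoform remains to compensate.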

nsSNPs and cInDels in ROH regions

We further characterized the Dark-fly ROH regions and identified 241 genes containing nsSNPs and/or cInDels (Table 4). GO analysis for the 241 genes listed 3 families (Table S9). One of them is associated with carboxylesterase activity (GO:0004091), and two of them are related families associated with small GTPase regulator activity (GO:0005083) and guanyl-nucleotide exchange factor activity (GO:0005085). Interestingly, both the carboxylesterase and the guanyl-nucleotide exchange factor families were also listed by the aforementioned GO analysis of total nsSNPs and cInDels (Table S1). Carboxylesterase genes are located as a cluster at the ROH ID#20 region on chromosome 3R (Table 4). Carboxylesterases are a family of enzymes that hydrolyze esters, and the alpha-esterase class listed here is involved in xenobiotic metabolism [27]. Guanyl-nucleotide exchange factors (GEFs) are regulators of small GTPases involved in various biological processes, such as neural development and activity [28]. These and other genes that carry nsSNPs and cInDels in the ROH regions are potential candidate genes related to the selected traits of Dark-fly (Table 4, File S1).
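
Collecting the genes that carry coding variants inside ROH regions amounts to an interval lookup. The Python sketch below shows one simple way to do this, assuming 1-based inclusive coordinates; the function name, region IDs, coordinates, and gene names are all hypothetical and do not reproduce Table 4.

```python
from collections import defaultdict


def genes_in_roh(roh_regions, variants):
    """Collect genes with at least one coding variant inside an ROH region.

    roh_regions : list of (roh_id, chromosome, start, end), 1-based inclusive
    variants    : list of (chromosome, position, gene) for nsSNPs and cInDels
    """
    genes_by_roh = defaultdict(set)
    for chrom, pos, gene in variants:
        for roh_id, roh_chrom, start, end in roh_regions:
            if chrom == roh_chrom and start <= pos <= end:
                genes_by_roh[roh_id].add(gene)
    return dict(genes_by_roh)


# Hypothetical regions and variants, for illustration only.
regions = [("ROH-A", "3R", 1_000, 5_000), ("ROH-B", "2L", 200, 900)]
coding_variants = [
    ("3R", 1_500, "geneP"),
    ("3R", 4_800, "geneQ"),
    ("3R", 9_000, "geneR"),  # outside every ROH region
    ("2L", 900, "geneS"),    # on the inclusive boundary
]
print(genes_in_roh(regions, coding_variants))
```

The nested loop is adequate for a few dozen regions; an interval tree or sorted-coordinate search would be a reasonable replacement for larger inputs.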