Abstract The emerging field of sociogenomics explores the relations between social behavior and genome structure and function. An important question is the extent to which associations between social behavior and gene expression are conserved among the Metazoa. Prior experimental work in an invertebrate model of social behavior, the honey bee, revealed distinct brain gene expression patterns in African and European honey bees, and within European honey bees with different behavioral phenotypes. The present work is a computational study of these previous findings in which we analyze, by orthology determination, the extent to which genes that are socially regulated in honey bees are conserved across the Metazoa. We found that the differentially expressed gene sets associated with alarm pheromone response, the difference between old and young bees, and the colony influence on soldier bees, are enriched in widely conserved genes, indicating that these differences have genomic bases shared with many other metazoans. By contrast, the sets of differentially expressed genes associated with the differences between African and European forager and guard bees are depleted in widely conserved genes, indicating that the genomic basis for this social behavior is relatively specific to honey bees. For the alarm pheromone response gene set, we found a particularly high degree of conservation with mammals, even though the alarm pheromone itself is bee-specific. Gene Ontology identification of human orthologs to the strongly conserved honey bee genes associated with the alarm pheromone response shows overrepresentation of protein metabolism, regulation of protein complex formation, and protein folding, perhaps associated with remodeling of critical neural circuits in response to alarm pheromone. We hypothesize that such remodeling may be an adaptation of social animals to process and respond appropriately to the complex patterns of conspecific communication essential for social organization.

Author Summary Sociogenomics explores the relationship between social behavior and the genome. An important issue is the extent to which results from social insects can be used to understand social behavior in other animals. We address this question through computational studies of previously published experimental data on patterns of brain gene expression in honey bees in response to particular environmental conditions and stimuli. We found that for one particular stimulus, response to alarm pheromone, the set of honey bee genes differentially expressed in the brain contains disproportionately large numbers of genes also found in mammals, including humans. This enrichment in orthologous genes suggests surprisingly strong similarities in socially responsive genetic circuits common to honey bees and mammals. A large number of the human counterparts of these genes are important for regulating protein folding, protein metabolism, and regulation of protein complex formation, perhaps reflecting changes in macromolecular complexes involved in remodeling critical neural circuits in response to the alarm pheromone. Noting that alarm pheromone is a component of the honey bee’s communication system, we hypothesize that such rapid remodeling may be an adaptation in the brain cells of social animals to deal with the complex patterns of conspecific signaling essential for social organization.

Citation: Liu H, Robinson GE, Jakobsson E (2016) Conservation in Mammals of Genes Associated with Aggression-Related Behavioral Phenotypes in Honey Bees. PLoS Comput Biol 12(6): e1004921. https://doi.org/10.1371/journal.pcbi.1004921 Editor: Hans A. Hofmann, The University of Texas, UNITED STATES Received: October 4, 2015; Accepted: April 17, 2016; Published: June 30, 2016 Copyright: © 2016 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability: All relevant data are within the paper and its supporting information files. Funding: Support is gratefully acknowledged from grants 1DP1OD006416 from the National Institutes of Health and 0835718 from the National Science Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: The authors have declared that no competing interests exist.

Introduction Social behavior, like phenotypes of any level of complexity, is regulated by the activity of genomic networks and resulting gene expression. At the same time that specific examples of genes influencing behavior were being discovered empirically[1,2], the field of systems biology was developing[3]. The essence of systems biology is to use computation and genomic technologies to enable detailed observation at the sequence level of the dynamics of cell, tissue, and organism responses to specific challenges. The power of systems biology is that it enables comprehensive dynamic patterns of transcription, translation, post-translational modification, and functioning of gene products to be observed and analyzed. These approaches provide fertile ground for the development of testable hypotheses and ultimately confident inferences about the relationship between the genome and phenome (the sum total of the organism’s phenotypic traits), even when the phenome is based on complex patterns of gene interactions. The systems approach has catalyzed the development of the fields of evo-devo[4] and, more recently, sociogenomics [1]. Sociogenomics focuses on how genes influence social behavior [2] and how environmental attributes—especially those related to the social environment—influence genome activity [5]. Evo-devo has led to new insights into the molecular basis for the evolution of morphological novelties, molecular mechanisms underlying the development of morphology in the individual, and how development responds to the environment on a genomic level. Specifically, it has shown that the major (but not only) driver in evolution of form has been changes in expression patterns of functionally conserved genes [6] Synergistically, sociogenomics seeks to provide insights into the evolution of social behavior, the genomic mechanisms underlying social behavior in an individual and a species, and how social behavior is influenced by the environment at the genomic level [1]. Similar to the evolution of biological form, the evolution of a vertebrate social decision-making network has been shown to be largely (but again not entirely) by variations in conserved genes and networks [7]. One approach to sociogenomics is hypothesis-driven. In this approach, researchers begin with a hypothesis about the role of a gene or a group of genes in social behavior based on prior knowledge of the function or activity of those genes. As an example of this approach, O’Tuathaigh et al [8] observed that the knockout of the mouse ortholog of the human schizophrenia gene neuregulin 1 disrupted social novelty behavior, but left spatial learning and working memory processes intact. This gene has close homologs throughout the vertebrates, putative orthologs in arthropods, and significantly similar homologs annotated as coding for cell wall anchoring proteins in some bacteria. By contrast, systems biology studies often begin with no hypothesis (except the fundamental one that social behavior has genomic bases) and scan comprehensively to see what correlations emerge. As an example of this approach, Cummings et al [9] identified differential gene expression patterns in the response of female swordtail fish to different classes of conspecifics (attractive males, unattractive males, other females). This broad systems approach was extended across multiple species in a study in which molecular orthology and comparative brain morphology were used to identify social behavior networks in vertebrates [10]. This work nicely illustrates the above-mentioned convergence of sociogenomics and evo-devo. The studies cited above highlight the fact that understanding the genomic correlates of human social behavior requires us to use a variety of model organisms, in part because of the invasive nature of many experimental protocols. Ebstein et al [2] observed that “Human beings are an incredibly social species and along with eusocial insects engage in the largest cooperative living groups in the planet’s history.” This leads to the question: to what extent are there relevant genomic correlates between eusocial insects and humans, given that the last common ancestor of eusocial insects and humans lived approximately 670 million years ago [11] and almost certainly was very different in appearance from either an insect or a vertebrate. It may be that both eusocial insect and human social traits are elaborations and modifications of underlying patterns that were present in a common ancestor, even if the elaborations occurred independently[12]. As a corollary to this view, some species in the lineages leading to both insects and chordates would have lost or inhibited expression of these traits, while other species such as the eusocial insects and humans would have continued to express them and use them as a set of building blocks for social behavior. To the extent this is true, comparative genomics of eusocial insect social behavior and human social behavior may yield insights into some of the most fundamental aspects of the genomics of social behavior. This would be an example of the general principle that conserved elements between species separated by great evolutionary distance are likely to be universal building blocks of common aspects of the species’ phenomes [6]. Among the eusocial insects, the honey bee is a valuable model organism. Many experiments have linked brain gene expression patterns to social behavioral characteristics and environmental stimuli, and the honey bee genome has been sequenced [13]. In addition, individual members of a honey bee colony have well-defined social roles. It is known that the division of labor within the hive is based on both genetic differences between individual honey bees and also on environmental influences that include visual, tactile, and chemical signals that colony members send to each other, as well as environmental influences external to the colony [13]. However, the interplay between these factors is poorly defined with respect to variation in particular genes or regulatory domains in the genome. There are statistically detectable hereditary tendencies for particular honey bees to play particular social roles, but the individual bee’s social role is determined by the interactions between both social and environmental factors, as well as heredity. Understanding this complex interplay of internal and external factors is central to sociogenomics. One way to make a connection between honey bee and human sociogenomics is by inference of genetic orthology. Unfortunately, orthology is of necessity not verifiable in the same fashion as other techniques of bioinformatics, since it involves theoretical reconstruction of an evolutionary history that cannot be experimentally replicated. Thus, there is no reliable validation set on which to test a method. Different reasonable ways of creating orthologies may give significantly different results [14]. Whether one makes a liberal or conservative interpretation of orthological relationships produced by a particular method depends on the context, in particular whether one is concerned about contamination by false positive identifications of orthologs, or more concerned about loss of information by false negatives. In the present paper, we use a new application of orthology to test the hypothesis that the social behavior of honey bees and other metazoans, including humans, has common fundamental genomic building blocks. This paper seeks to explore the degree of relevant sequence conservation between honey bees and humans. Our starting point is the data set from Alaux et al [15], who used microarrays to analyze differential brain gene expression patterns exhibited by individual honey bees of different genetic backgrounds, engaged in different social roles and in different colony environments. African and European honey bees are subspecies of the Western honey bee, Apis mellifera, and they differ from each other in their hive defense behavior in a number of ways that have been summarized as a social behavioral counterpart to variations of threshold and intensity of the “flight or fight” response seen in vertebrate organisms; African bees are much more aggressive than European bees [16]. In general, different phenotypes may arise from either differences in gene function or from different patterns of gene expression [17]. In the African and European honey bees it is presumed that the different phenotypes are largely the result of different patterns of gene expression, and differences in the expression of hundreds of genes in the brain have been reported [15]. Bees in Alaux et al were raised in a cross-fostered experimental design to examine the influences of genetic background and social environment on brain gene expression. We analyzed the above-cited [15] data sets to explore the following two questions: 1) to what extent are the differentially expressed genes associated with social behavior in the honey bee conserved across the Metazoa; and 2) through analysis of the highly conserved genes, is it possible to infer that there are likely to be gene co-expression patterns associated with social behavior that are common to a wide range of metazoans, including humans?

Discussion This study was designed to examine the plausibility of the premise that the genomic networks underlying a response to a stimulus for social behavior (alarm pheromone response in honey bees) might have counterparts conserved in mammals, even though mammals do not use this particular alarm pheromone and the last common ancestor between honey bees and mammals lived approximately 670 million years ago [11]. Based on results from two different orthology databases, we found that the honey bee genes differentially expressed in response to alarm pheromone were more strongly conserved in orthologs to mammals than in orthologs to other metazoans, including those more closely related to the honey bee (nonsocial insects). We hypothesize that these orthologous sets are conserved remnants of a network responding to conspecific signals that first emerged in a common ancestor of insects and vertebrates and has been selectively conserved in social metazoa. The reader will have noted that the experimental context of this paper was done on material from whole brains. For processing of conspecific signals such as spoken or written language in humans, many imaging studies show that several different regions of the brain are simultaneously activated. We therefore believe that whole brain studies such as ours are useful in revealing underlying commonalities of mechanism, but should be complemented by region-specific analyses. It should be noted that this particular study deals only with those parts of the putative conserved network that are differentially expressed in response to the external signal. There may be other genes that are part of the network, but are present at relatively steady levels. This may be the reason for the conspicuous lack of genes for plasma membrane proteins in the “cellular component” category of enriched GO classes found in this study (Table 4). Plasma membrane proteins must be involved in any response to external signals, but their role in mediating between extracellular stimuli and intracellular response does not necessarily require either up- or down-regulation in immediate response to the alarm pheromone stimulus. By contrast, genes in our study whose products reside in the nucleus were upregulated, genes in the mitochondria and other organelles were downregulated, and significant numbers of genes in the remainder of the cell were differentially regulated in both directions. Our results indicate that alarm pheromone exposure triggers significant physical remodeling of intracellular molecular signaling machinery. At the core of sociality is the ability to transmit and respond to complicated signals from conspecifics [21]. This is widely thought to involve the ability of nervous systems to rapidly increase the activity of some cellular networks and reduce the activity of others in response to these signals [22]. Our results suggest that there is another level of complexity enabled by the ability to remodel macromolecular interaction networks within cells in response to a transient signal from conspecifics, such as alarm pheromone. This remodeling would allow for changes in responses to subsequent signals, i.e., for stimuli experienced presently to enable individuals to “predict” the future. Since our results are based on enrichment of orthologous genes between honey bees and mammals, this hypothesis implies the original development of this remodeling ability in an ancient common ancestor of mammals and insects. Based on these results we offer the following speculation about possible mechanisms for macromolecular remodeling within brain cells and organismic sociality. The time scales for protein folding, for binding reactions, and for assembly of macromolecular complexes from pre-existing elements, can be fractions of a second, so these processes can take place rapidly enough to be consistent with the time scale of the alarm pheromone response. However, transcription and translation of genes will take many seconds or minutes [23]. The necessarily faster time scale for the alarm pheromone response suggests involvement of a more rapid remodeling process, perhaps involving microRNAs, which have for several years been postulated to play a role in synaptic plasticity [24]. The recently developed CLIP-seq technology [25] permits comprehensive identification of microRNA binding sites in a variety of tissues, including the brain [26]. Thus it should be possible in the future to explore this speculation and experimentally characterize the roles of specific microRNA in brain remodeling in response to conspecific signals. Perhaps one aspect of the dichotomy between highly social and solitary animals is in the ability of the individual brain cells in social animals to remodel their interaction networks in response to signals from conspecifics. This ability would not come without a tradeoff, since maximal speed of response would be achieved by activating existing hard-wired networks. Thus evolutionary niches have persisted for both highly social and less social animals, with less social animals optimized for speed of response to all stimuli by activating hard-wired circuits, while highly social animals have developed the ability to remodel molecular circuits in response to signals from conspecifics—a process which results in necessarily slower response. This may also apply to the evolution of the most complex form of conspecific communication–human language. In this view the corresponding circuits underlying honey bee chemical language and human auditory language would be “phenologs”; that is, varying phenomes based on orthologous genes [27].

Methods Statistical Analyses of Ortholog Gene Counts The p-values in Table 2 for the average number of metazoan orthologs for each data set were computed as follows: For each experimental data set, random sets of matching size were sampled from the 7462 honey bee genes that were present in InParanoid database [18] and spotted on the array, and the average number of orthologs per gene was calculated for each random set. This random sampling was repeated one million times and the number of random sets with average ortholog number equal to or larger than the experimental set was counted. The count divided by 106 gave us the p-value for the average ortholog number of the test set. S1 and S2 Figs show how the p-values of the average ortholog number of Forager_CG and Alarm_Pheromone sets were calculated. The p-values for the total number of orthologs of each set for each species (Fig 3) were computed similarly. For calculating the p-value for the CG-WG difference, the KS-test p-values for the CG-WG difference for Soldier, Forager and Guard (0.026, .122, and .612 respectively) were combined using Fisher’s method [28]. For calculating the p-value for over-representation of orthologs of placental mammals in the Alarm_Pheromone set and over-representation of orthologs of insects in the Old_vs_Young set, p-values in each species (S1 Table) were also combined using Fisher’s method. In presenting and discussing the results, we use the term “conserved” to be measured by the number of orthologs that a particular sequence has; i.e., the more orthologs a gene or protein has in other species, the more “conserved” the gene is. Gene Ontology Analysis Enrichment of the conserved gene sets in particular Gene Ontology categories was determined using the functional annotation tool in the Database for Annotation, Visualization, and Integrated Discovery (DAVID)[29]. All parameters are default except that we use GO_TERM_*_ALL instead of GO_*_FAT. Extra functional analyses (of various qualities) were also included: OMIM_Disease [30], COG_Ontology [31], SP_PIR_Keywords [32], Up_Seq_Feature [33], BBID [34], BioCarta [35], Kegg_Pathway [36], Interpro Domains [37], Pir_Superfamily [38], and Smart [39]. The raw Gene Ontology results of “Eutheria-conserved”,Alarm_Pheromone genes are listed in S2 Table. Figures of Gene Ontology trees (Figs 4 and 5) were generated by Python scripts and Cytoscape [40]. Benjamini-Hochberg corrected p-values provided by DAVID are used for indication of significance [29]. Scientific references about the relationship between behavior/neural functions and genes associated with significant GO terms were identified with GeneCard and manual search with Google Scholar, using keywords “behavior”,”disease”,”neural”, and”aggression”. Identifying Metazoan Orthologs of Honey Bee Genes First, honey bee genes that showed up on the microarray studied in Alaux et al [15] were selected. This was done based on the annotation file of the Honey Bee Oligonucleotide Microarray [15]. Out of many available methods [14] of defining orthologs, two were chosen, InParanoid [18] and OrthoMCL [41]. InParanoid has the most extensive coverage of the honey bee proteome and other proteomes of completed genomes in searchable ortholog databases. Out of all these “microarray-present” honey bee genes, we identified those that are also present in InParanoid. This was done by mapping the BeeBase IDs (which are the IDs used in the data set from Alaux et al [15]) to NCBI Refseq IDs (which are the IDs used in InParanoid for honey bee). 7462 of these “microarray-present” honey bee genes are present in InParanoid. At the time of the analysis, there were 100 eukaryotic species in InParanoid with 54 of them (including Apis mellifera) being metazoan species. With S. cerevisiae added as a control, the data set used for our analysis had 55 species, which we interrogated for orthology with the 7462 InParanoid honey bee proteins.

Acknowledgments Thanks are due for useful discussion with members of the Carl R. Woese Institute for Genomic Biology research theme, “Gene Networks in Neural & Developmental Plasticity,” and C.C. Lutz for editing the manuscript.

Author Contributions Conceived and designed the experiments: HL GER EJ. Performed the experiments: HL. Analyzed the data: HL GER EJ. Wrote the paper: HL GER EJ.