The number of MHC class I alleles correlates negatively with the size of the T cell receptor repertoire, supporting the role of constraints associated with increasing ranges of bound antigens.

Promiscuous alleles and species with more MHC genes appear to be more common in pathogen-rich populations.

MHC alleles may differ by orders of magnitude in the range of antigens they bind.

Novel MHC alleles have been demonstrated to confer better resistance to local parasites.

Proteins encoded by the classical major histocompatibility complex (MHC) genes incite the vertebrate adaptive immune response by presenting peptide antigens on the cell surface. Here, we review mechanisms explaining landmark features of these genes: extreme polymorphism, excess of nonsynonymous changes in peptide-binding domains, and long gene genealogies. Recent studies provide evidence that these features may arise due to pathogens evolving ways to evade immune response guided by the locally common MHC alleles. However, complexities of selection on MHC genes are simultaneously being revealed that need to be incorporated into existing theory. These include pathogen-driven selection for antigen-binding breadth and expansion of the MHC gene family, associated autoimmunity trade-offs, hitchhiking of deleterious mutations linked to the MHC, geographic subdivision, and adaptive introgression.

Evidence is accumulating, as has long been suspected based on the function of MHC proteins, that pathogens impose significant selection on MHC (Figure 1) and, importantly, drive MHC allele frequency changes in natural populations []. However, the specific selection mechanisms that shape the extraordinary diversity of MHC genes are still controversial (Figure 2, Key Figure). An associated question is whether these mechanisms can explain the evolutionary persistence of MHC allelic lineages for a much longer time than expected under neutrality, leading to trans-species polymorphism [], and an excess of nonsynonymous changes in MHC sequences. Yet another enigma is why MHC diversity at the individual level is limited, constraining an individual’s ability to raise an effective response to parasites, even though expressing more MHC molecules or molecular variants capable of binding a broader spectrum of antigens [] could alleviate this constraint. Recent years have brought significant progress in addressing these questions. New studies have provided clear evidence that several previously proposed evolutionary mechanisms indeed act on MHC in nature. In addition to addressing these long-standing enigmas, they have also identified complexities of selection acting on MHC that had not been considered previously, and have allowed new research questions to be formulated. Here we review this recent progress and highlight outstanding and emerging questions.

Upper panel: MHC polymorphism. (A) Fast adaptation of pathogens reduces fitness of common alleles, favoring rare MHC alleles, as well as functionally novel alleles (carrying nonsynonymous mutations, incoming arrow; outgoing arrows denote alleles lost due to selection or drift; adapted from []). (B) Exposure of host genotypes to multiple pathogens can lead to heterozygote advantage (HA) when exposure takes place in a single generation, or help maintain polymorphism via fluctuating selection (FS) if exposure to different pathogens takes place in different generations. This is particularly likely if resistance is dominant and different alleles confer resistance to different pathogens, such that fitness (a product of resistance to pathogens encountered) can be highest in heterozygotes (within or across generations for HA and FS, respectively). Lower panel: Individual binding range – a composite of MHC gene number and allele-binding properties. (C) By extension of the HA mechanism, MHC duplication and divergence should increase the spectrum of antigens an individual can present, increasing the probability of immune response. However, as the number of self-peptides presented would also increase, costs may outweigh the benefits, for example, via processes that deal with autoimmunity, such as negative selection in the thymus against self-reactive T cells. The same processes are likely to shape the evolution of the binding range of particular alleles. In either case, antigen-binding range should be optimized, narrowing the distribution of within-individual binding range. (D) However, beneficial alleles [subject to continuous turnover, upper panel (A)] can occur on haplotypes carrying a suboptimal number of genes, widening the distribution of MHC diversity via hitchhiking (dark gray area) compared with purely optimizing selection (light gray area). In addition, deleterious mutations (red stars) linked to MHC genes can hitchhike with beneficial alleles (green circles). The blue double-headed arrow indicates that processes shaping population and individual diversity are inter-related (see the main text for details). Abbreviation: TCR, T cell receptor.

The MHC is a gene-dense region in jawed vertebrate genomes enriched for immunity genes. The classical MHC genes, which will be the subject of this review, encode glycoproteins that bind peptides, both self and non-self, inside the cell and deliver them to the surface for inspection by T cells and natural killer (NK) cells [] (Box 1, Box 2). This antigen presentation is a crucial step in the adaptive immune response as it allows self/non-self discrimination by T cells, ultimately facilitating the recognition of infecting pathogens. The feature that distinguishes classical MHC genes (MHC genes hereafter) from other genes in the MHC region is their extreme polymorphism, with dozens to hundreds of allelic variants segregating in natural populations []. The polymorphism is most pronounced in the peptide-binding domain (PBD; see Glossary), in particular at peptide-binding sites (PBSs), the amino-acid residues interacting directly with antigens []. Consequently, molecules coded by different MHC alleles differ in their antigen-binding profiles [], which in turn affect susceptibility to disease []. Polymorphism apparently evolves under positive selection, as evidenced by the high relative nonsynonymous substitution rate within the PBD [], particularly at PBSs [], as well as by large short-term selection coefficients (Figure 1). High polymorphism coupled with evidence for balancing selection has made MHC genes an attractive model for studying how selection can promote and maintain genetic variation in natural populations.

Selection coefficients (i.e., differences in relative fitness between genotypes) were identified from 19 studies (see Table S1 in the supplemental information online). Broken vertical lines are mean values by relative timescale captured by the form of selection. Asterisks denote estimates for which selection could be ascribed to pathogens. Counts are the number of selection coefficients. Heterozygote advantage, single-allele effects, and ancestral variation retention are considered to capture more recent selection events, while PBD and phylogeny-based values are considered to capture more long-term historic selection (see Table S1 in the supplemental information online for more detail on how selection coefficients were estimated). Higher estimates in some recent selection events are likely to reflect the dynamic nature of host–parasite coevolution, resulting in bouts of strong selection acting on MHC.

Given that both the NKR and MHC genetic systems are highly polymorphic but located on different chromosomes, there can be epistasis strongly affecting traits such as resistance to infectious disease, susceptibility to autoimmunity, and aspects of reproduction. For instance, HLA-C molecules expressed in human fetal trophoblasts are recognized by KIRs on maternal NK cells, with the strength of interaction between particular paternal HLA-C alleles and particular maternal KIR alleles eventually determining the blood supply to the developing embryo and pregnancy success (see Box 4) [].

The MHC-I molecules and their interactions with various receptors are shown from their side and top views. The nine amino acids (aa) of the peptide (in blue) interact with the MHC-I PBD formed by the α1 and α2 chains. From left to right: the T cell receptor interacts with the MHC–peptide complex, generally binding with positions 4, 5, and 6 of the peptide. The lectin-like NK receptors generally bind in two sites: site 1 is off the N-terminal end of the α1 chain, avoiding the peptide, and site 2 (receptor with broken line) is underneath the α1–α2 domain in contact with α3 and β2m. The KIRs bind the top of the α1–α2 domain including the C-terminal end of the bound peptide. The LILRs bind the MHC α3 domain and β2-microglobulin.

NKR systems typically evolve very rapidly, with both copy number variation and high allelic polymorphism. Some species have predominantly lectin-like NKRs (like Ly49 in mice), others have predominantly immunoglobulin (Ig)-like NKRs (like KIRs in humans), and still others have both (like cattle) or neither (like marine mammals). In humans, the LILR genes are located next to the KIR genes, to which they are related []. The various receptors recognize (and thus put selective pressure on) different parts of the MHC-I molecules (Figure I): lectin-like receptors bind under the PBD, KIRs bind the top of the PBD including the C-terminal end of the bound peptide, and LILRs typically bind the MHC α3 domain and the small subunit β2-microglobulin [].


NK cells can kill cells that lose cell surface expression of MHC-I molecules due to viral infection, cancer, or even stress. However, some NKRs evolved to recognize decoy MHC-I molecules (co-opted by viruses to prevent killing) and they activate killing. Moreover, NKRs found on T cells are involved in driving cell proliferation, as are the leukocyte immunoglobulin-like receptor (LILR, synonym LIR) molecules of myeloid cells and lymphocytes. In addition, some NK cells in humans and mice bind certain MHC-I molecules to affect proliferation of invasive trophoblasts in the placenta [].

TCRs interact with peptides bound by MHC molecules, as well as with parts of the PBD (formed by the α1 and α2 chains in the case of MHC-I, Figure I). Some classical (and nonclassical) MHC-I molecules can be ligands for receptors on NK and myeloid cells, resulting in another level of selection beyond T cells. Some NK receptors (NKRs), notably the killer-cell immunoglobulin-like receptors (KIRs) found in humans, also recognize portions of the bound peptide, potentially influencing the PBSs [].

Nonclassical genes are related to classical MHC genes (and often difficult to distinguish from them on the basis of sequence alone) but lack one or more of their salient features – high polymorphism, wide and high expression, and presentation of peptides to T cells. Their functions vary from immune functions of many different kinds to non-immune physiology [].

(Top) The classical MHC region spans approximately 3.5 Mb and comprises (middle) more than 280 genes, including the classical and nonclassical class I, II, and III genes. (Bottom) The mean recombination rate in the class I region (0.443 cM/Mb) is lower, and that in the class II region (1.712 cM/Mb) higher, than the genomic average (1.2 cM/Mb); the recombination rate varies widely throughout the region (range 0.001–67 cM/Mb), which includes hotspots of extreme recombination [].

To achieve discrimination between self and non-self, T cells are ‘educated’ in the thymus: T cells are first positively selected based on their reaction with self-peptides presented on MHC molecules, and are then negatively selected against strong recognition of self-peptides [].

Classical MHC molecules bind pieces of proteins (peptides) from inside cells and allow them to be recognized on the cell surface by T lymphocytes. The peptides are bound by pockets (PBSs) in a groove (part of the PBD) exhibiting extensive sequence variation. Normally, these peptides are derived from self (host) proteins, but upon infection (or transformation in cancer), MHC molecules present non-self (pathogen or mutated) peptides to the T cells, leading to appropriate immune responses. Class I molecules bind peptides largely from the cytoplasm and contiguous structures like the nucleus, and are recognized by CD8 cytotoxic T lymphocytes. Class II molecules bind peptides largely from intracellular vesicles which are in contact with the outside of the cell, and are recognized by CD4 helper (and regulatory) lymphocytes []. MHC-I molecules are also recognized by polymorphic receptors on NK cells (see Box 2).

The MHC was discovered as the genetic locus leading to rapid graft rejection, which is due to highly polymorphic cell surface molecules encoded by classical class I (MHC-I) and class II (MHC-II) genes []. For most jawed vertebrates, classical MHC genes are scattered throughout a large genomic region with variable levels of recombination (Figure I) along with many other genes, including some involved in classical MHC function (such as TAP and tapasin genes). However, classical MHC genes are found on several chromosomes in teleost (bony) fish, and the MHC-I gene family is highly expanded with loss of MHC-II genes in Gadiform fish like the Atlantic cod [].


How Parasites Select for MHC Polymorphism

The number of alleles segregating at MHC loci is hardly matched by any other gene, and it is thus natural that the maintenance of this polymorphism has been a focus of evolutionarily oriented MHC research. Because of the functions of MHC molecules in immune response, selection by pathogens has generally been assumed to be the main underlying force, and has indeed been reported in multiple studies [] (Figure 1). Two mechanistic explanations have been considered ever since MHC gene discovery: heterozygote advantage (HA) and rare-allele advantage or negative frequency-dependent selection (NFDS) []. Another mechanism, based on fluctuating selection (FS) over time and/or space, was proposed later on and is conceptually related to HA [] (Box 3).

Box 3. Mechanisms Proposed to Maintain MHC Polymorphism

Parasite-Driven Mechanisms (Not Mutually Exclusive)

Heterozygote advantage (HA). Because each MHC molecular variant is able to present only a limited repertoire of antigens to T cells, being heterozygous and thus expressing two different MHC proteins should increase the probability of presenting a given antigen and thus raising an adaptive immune response []. For HA to be able to maintain polymorphism, resistance to a single pathogen should be overdominant or, more plausibly, dominant with different alleles conferring resistance to different pathogen species or strains, resulting in fitness over multiple infections that is overdominant (see Figure 2 in main text) []. However, dominance of resistance appears not to be a universal feature of MHC []. Furthermore, the existing theory predicts that unless fitness contributions of different alleles to resistance are similar, HA alone can maintain far fewer alleles than observed in natural populations [].

Negative frequency-dependent selection (NFDS) arises from the fact that pathogens will tend to adapt by evading presentation by the most common MHC types []. Simulations of host–parasite coevolution suggest that this mechanism is capable of maintaining high levels of MHC polymorphism [].

Fluctuating selection (FS) arises when there is variation in the presence of pathogens over time. The process can maintain MHC polymorphism under restrictions shared with the HA model (dominance, similar fitness contributions of alleles), plus balanced occurrence of pathogen species in time []. Selection on MHC alleles can also vary in space, for which there is some evidence from the field (see main text) but which has been little explored theoretically.

Other Mechanisms

Mate choice for dissimilar mates can in theory maintain MHC polymorphism even in the absence of selection from parasites []. It should be stressed, however, that the evolution of such preferences requires pre-existing MHC polymorphism []. Nevertheless, mate choice for advantageous and compatible MHC genotypes can substantially affect the speed of MHC evolution [].

Sheltered load would accumulate if recessive deleterious mutations linked to MHC were hidden from selection due to high MHC heterozygosity [], analogous to a mechanism earlier postulated for plant self-incompatibility genes []. Similar to mate choice, this mechanism requires pre-existing balancing selection, but once at work it could potentially help maintain polymorphism in periods when selection from parasites is weak [].

HA can arise from some degree of dominance of resistance, which allows heterozygotes to respond to a wider range of pathogens or pathogen strains compared with homozygotes (Box 3), or from overdominance, whereby heterozygotes have intrinsically higher fitness than homozygotes []. HA has been tested in many studies, and MHC heterozygosity was indeed sometimes reported to be associated with greater resistance to infection (Figure 1; reviewed in []). However, it may be difficult to distinguish selection favoring heterozygotes from selection favoring particular alleles, which, depending on their frequency, may be present mainly in homozygotes or heterozygotes []. HA also received some support from experimental infection studies with inbred mice and with frogs [], but a study on outbred mice did not find support for HA in resistance to infection []. Even if, on average, heterozygotes are fitter than homozygotes, existing theory suggests that, similar to classical overdominance models [], HA resulting from dominant resistance can maintain only a limited number of alleles, calling into question a major role of HA in the maintenance of MHC polymorphism (Box 3). This constraint is somewhat alleviated by the divergent allele advantage (DAA) version of HA []. DAA assumes that MHC heterozygotes carrying alleles with more divergent binding properties should present a larger overall repertoire of antigens, an assumption supported by correlational studies [] as well as by computational analyses [].
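The way dominant resistance to different pathogens can produce an overall heterozygote advantage can be illustrated with a minimal sketch. It assumes a hypothetical system of two alleles (`A`, `B`) and two pathogens (`p1`, `p2`), fully dominant resistance, and fitness computed as the product of survival across pathogens; the names and the fitness cost of 0.3 are illustrative assumptions, not estimates from any study.

```python
from itertools import combinations_with_replacement

# Hypothetical resistance table (assumption): allele -> pathogens it protects against.
RESISTS = {"A": {"p1"}, "B": {"p2"}}
PATHOGENS = ("p1", "p2")
COST = 0.3  # assumed fitness loss for each pathogen the host cannot resist


def fitness(genotype, pathogens=PATHOGENS, cost=COST):
    """Multiplicative fitness across pathogens with fully dominant resistance."""
    w = 1.0
    for pathogen in pathogens:
        # One copy of a resistance allele is enough (dominance).
        resistant = any(pathogen in RESISTS[allele] for allele in genotype)
        if not resistant:
            w *= 1.0 - cost
    return w


def genotype_fitnesses():
    """Fitness of each diploid genotype that can be formed from alleles A and B."""
    return {
        "".join(g): fitness(g)
        for g in combinations_with_replacement("AB", 2)
    }


if __name__ == "__main__":
    for genotype, w in genotype_fitnesses().items():
        print(genotype, round(w, 3))
```

Under these assumptions each homozygote pays the cost of the one pathogen it cannot resist, whereas the heterozygote resists both, so fitness over multiple infections is overdominant even though resistance to each single pathogen is only dominant.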

NFDS can in theory readily maintain observed levels of MHC polymorphism []. Furthermore, the dynamic nature of host–parasite coevolution underlying NFDS [] is easy to reconcile with high selection coefficients sometimes reported to act on MHC in the short term (Figure 1). Yet, NFDS has been harder to demonstrate than HA. Associations of infection with MHC alleles, reported multiple times (Figure 1; reviewed in []), are consistent not only with NFDS, but also with FS, HA (if resistance alleles are rare, they occur almost exclusively in heterozygotes), and even with directional selection depleting MHC polymorphism. Relating parasite load to snapshots of current allele frequencies may not be very informative either, because host–parasite coevolution may superimpose time lags on allele frequency changes. Therefore, a currently rare MHC allele that was common in the recent past can still be susceptible to pathogens and, conversely, pathogens might not have adapted to a currently common, beneficial MHC allele if it increased in frequency only recently. That is probably why frequency dependence may be observed for some, but not for other, snapshots of populations within the same system []. This shortcoming of snapshot studies has recently been overcome by using a system of allopatric guppy populations in which introduction of MHC alleles to which pathogens have not had a chance to adapt for many generations indeed increases resistance to a parasite []. Also consistent with fast local adaptation, experimental evolution in the laboratory showed that viruses passaged for a dozen generations on a mouse strain of a given MHC haplotype evolved improved infection success on that haplotype, but not on alternative haplotypes to which they had not been exposed []. These findings are consistent with the proposed mechanism of NFDS whereby rare MHC alleles may ‘regain’ advantage after pathogens are forced to adapt to more common MHC alleles []. Still, a full cycle in which a common MHC allele is evaded by pathogens, becomes rare, and then regains resistance has not yet been demonstrated.
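The rare-allele advantage and the time lag discussed above can be sketched with a deliberately simple deterministic model. All of its ingredients are assumptions made for illustration: hosts are treated as haploid, the fitness of an allele declines linearly with a pathogen "adaptation" value, and that adaptation tracks host allele frequencies with a lag set by `lag_rate`. The function name and parameter values are hypothetical and do not come from the studies reviewed here.

```python
def simulate_nfds(freqs, s=0.5, lag_rate=0.2, generations=500):
    """Iterate host allele frequencies while pathogen adaptation lags behind them.

    freqs    -- initial host allele frequencies (must sum to 1)
    s        -- assumed maximum fitness cost of a fully evaded allele
    lag_rate -- assumed fraction of the gap to current host frequencies
                that pathogens close each generation
    """
    p = list(freqs)
    q = list(freqs)  # pathogen adaptation, initially matched to host frequencies
    history = [tuple(p)]
    for _ in range(generations):
        # Alleles that pathogens are well adapted to have lower fitness.
        w = [1.0 - s * qi for qi in q]
        mean_w = sum(pi * wi for pi, wi in zip(p, w))
        new_p = [pi * wi / mean_w for pi, wi in zip(p, w)]
        # Pathogens move part of the way toward the host frequencies
        # they experienced in this generation.
        q = [(1.0 - lag_rate) * qi + lag_rate * pi for qi, pi in zip(q, p)]
        p = new_p
        history.append(tuple(p))
    return history


if __name__ == "__main__":
    trajectory = simulate_nfds([0.9, 0.05, 0.05])
    for generation in (0, 5, 10, 20, 50, 500):
        print(generation, [round(f, 3) for f in trajectory[generation]])
```

In this sketch the initially common allele declines because pathogens are best adapted to it, and it keeps declining for a while after it has stopped being the most common allele because the adaptation value lags behind. This is the kind of lag that can make a single snapshot of allele frequencies and parasite load misleading.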

By contrast, there is some evidence that at least sometimes strong selection from pathogens may in fact reduce MHC polymorphism, as exemplified by the chimpanzee MHC-I A locus (see Box 2 for functions of class I and II loci), which shows an order of magnitude lower diversity compared with the orthologous human leukocyte antigen (HLA)-A locus in humans, even though we would expect the reverse based on the larger effective population size in chimpanzees []. This is likely an effect of simian immunodeficiency virus epidemics some 3 million years ago (mya), strongly favoring MHC alleles functionally related to those slowing AIDS progression in humans []. Conversely, a recent ancient DNA study showed that the same MHC-II DRB1*15:01 allele that confers susceptibility to the leprosy-causing Mycobacterium leprae in contemporary populations was already positively associated with leprosy infection in medieval Europe []. Still, the allele shows only a minor reduction in frequency and remains common in contemporary Europe, suggesting that selection against it was not effective, perhaps due to pleiotropic effects and resulting fitness trade-offs, such as its protective effect against type 1 diabetes []. Coupled with little evidence for adaptive evolution in M. leprae to evade presentation by MHC-II proteins, these data provide no support for NFDS, at least over the ~24 generations covered by this study. Overall, the role of NFDS in maintaining MHC polymorphism, while supported by recent evidence demonstrating preconditions necessary for it to work [], remains to be more firmly established.