Comparison of antibody responses to H5N1 vaccine among three IGHV1-69 genotypic groups

Individuals from a 2007 H5N1 vaccination trial were genotyped and phenotyped for the IGHV1-69 CDR-H2 Phe54 F/L polymorphism (rs55891010; see Fig. 1a and Methods). Their one-month post-vaccination sera were competed against the anti-stem sBnAb F10 for binding to the pandemic H1CA0709 HA, which was not circulating when the serum samples were collected. Figure 1b shows a statistically significant difference in F10 blocking activity among the groups: activity was highest for the F/F group, followed in decreasing order by the F/L and L/L groups. The mean microneutralization (MN) titers for the F/F group were 1.67- and 2.29-fold higher than the mean values for the F/L and L/L groups, respectively, with a similar trend in the median values (Supplementary Fig. 2a). The post-vaccination hemagglutination inhibition (HAI) titers and the ELISA titers for H1CA0709 and H1CA0709 HA proteins did not differ significantly among the three IGHV1-69 genotypic groups (Supplementary Fig. 2b,d,e). In addition, when HAI and MN titers were compared within individuals, there was a trend toward lower HAI/MN ratios for the F/F and F/L groups compared with L/L individuals (Supplementary Fig. 2c). Supplementary Fig. 3 shows that stem-binding activity originally boosted by H5VN04 vaccination was generally maintained within each genotypic group over the 4-year period. The similar trends observed in the F10 competition studies, MN titers, and HAI/MN ratios support the concept that IGHV1-69 germline polymorphism affects the profile of the HA-directed Ab response, with expression from F-alleles leading to a higher Ab response to the stem domain.

Figure 1: Correlation between IGHV1-69 polymorphism and Ab response to the H5 vaccine. (a) The pre-vaccination sera of the 85 individuals were diluted 1/1250 and analyzed for binding to the anti-IGHV1-69 idiotype mAb G6. Binding activities were normalized by subtracting the MSD signal obtained with an isotype control from the G6 MSD signal, and by using a standard curve made with the IGHV1-69 F-allele-based IgG Ab D8035. (b) Post-vaccination sera (diluted 1/125) were competed with the anti-stem Ab F10 IgG for binding to H1CA0709. Cuzick’s trend test was used to further confirm that the occurrence of F-alleles increases the ability of serum to block F10 binding (L/L = 0, F/L = 1, F/F = 2). Error bars represent standard error of the mean.
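As an illustration of the trend test named in the Fig. 1 legend, the following is a minimal sketch of Cuzick's test for ordered groups, using the genotype scores from the legend (L/L = 0, F/L = 1, F/F = 2). The function names and the blocking values are hypothetical and are not taken from the study; the sketch uses the normal approximation without a tie correction, which may differ from the software the authors used.

```python
import math

def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def cuzick_trend(groups):
    """groups maps an ordinal score to a list of observations.

    Returns (z, two-sided p) from the normal approximation,
    with no correction for ties.
    """
    scores, values = [], []
    for score, observations in groups.items():
        scores.extend([score] * len(observations))
        values.extend(observations)
    n = len(values)
    ranks = midranks(values)
    t = sum(s * r for s, r in zip(scores, ranks))   # score-weighted rank sum
    score_sum = sum(scores)                         # sum of l_i * n_i
    score_sq_sum = sum(s * s for s in scores)       # sum of l_i^2 * n_i
    expected = (n + 1) / 2 * score_sum
    variance = (n + 1) / 12 * (n * score_sq_sum - score_sum ** 2)
    z = (t - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical percent-blocking values, keyed by genotype score.
blocking = {0: [10, 20, 30], 1: [40, 50, 60], 2: [70, 80, 90]}
z, p = cuzick_trend(blocking)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A positive z indicates that the measured value tends to increase with the number of F-alleles.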

Effect of IGHV1-69 polymorphism on germline gene utilization and expressed HV1-69-sBnAb repertoires

To assess the role of IGH locus polymorphism in shaping expressed IGHV1-69 germline gene repertoires, ≥5 × 10^6 PBMCs (circa 10% B cells) were analyzed from the blood samples of 18 individuals (F/F = 4, F/L = 11, L/L = 3), collected 4 years after the H5N1 vaccine trial. The IGHV-gene frequencies from independent V(D)J rearrangements were rendered non-redundant, and IgM and IgG class determinations were made by analyzing the PCR products obtained from reverse priming with IG constant region primers. Figure 2 shows that in both the unmutated IgM (naïve) and all IgG (memory) V-segment datasets, IGHV1-69 usage was highest in the F/F group (7.7% IgM, 3.9% IgG), intermediate in the F/L group (4.7% IgM, 3% IgG), and lowest in the L/L group (1.8% IgM, 1.4% IgG). The significance of the ~3-fold difference in IGHV1-69 usage between the F/F and L/L groups was further underscored by the observation that, in the F/F group, IGHV1-69 was the 4th and 7th most frequently used IGHV germline gene in the unmutated IgM and IgG datasets, respectively, whereas in the L/L group IGHV1-69 was ranked 18th and 23rd (data not shown). This variation in IGHV1-69 germline gene utilization was also seen for putative HV1-69-sBnAbs, with the highest frequencies and correlation coefficients in individuals with F/F alleles and across the IgM B cell subset (Supplementary Fig. 4a–d). We have been able to further delineate some of these HV1-69-sBnAb signatures through functional analyses (Supplementary Fig. 5 and text). These results demonstrate that F-allele individuals have higher levels of circulating IGHV1-69 Ab and HV1-69-sBnAb repertoires than L-allele individuals.
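The usage frequencies above depend on collapsing redundant reads before counting. The following is a minimal sketch of that idea, assuming (hypothetically) that a rearrangement is identified by its V gene, CDR-H3 sequence, and J gene; the actual clone definition used in the study is not stated in this section, and the reads shown are invented for illustration.

```python
from collections import Counter

def v_gene_usage(reads):
    """Compute V-gene usage over non-redundant rearrangements.

    reads: iterable of (v_gene, cdr3, j_gene) tuples, one per sequence read.
    Identical tuples are collapsed so that each rearrangement is counted once.
    Returns {v_gene: fraction of unique rearrangements}.
    """
    unique_rearrangements = set(reads)
    counts = Counter(v_gene for v_gene, _, _ in unique_rearrangements)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}

# Hypothetical reads; the first two are duplicates of one rearrangement.
reads = [
    ("IGHV1-69", "ARGPY", "IGHJ4"),
    ("IGHV1-69", "ARGPY", "IGHJ4"),
    ("IGHV1-69", "ARDGY", "IGHJ6"),
    ("IGHV3-23", "AKDLW", "IGHJ4"),
    ("IGHV4-34", "ARGYF", "IGHJ5"),
]
usage = v_gene_usage(reads)
print(usage["IGHV1-69"])
```

Counting unique rearrangements rather than raw reads keeps a single highly amplified sequence from inflating the apparent usage of its V gene.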

Figure 2: Analyzing IGHV1-69 V-segment gene utilization among the three IGHV1-69 genotypic groups. (a) The frequency of IGHV1-69 IgM clones defined by unmutated V-segments; (b) the frequency of IGHV1-69 IgG clones. Error bars represent standard error of the mean.

Differential effects of IGHV1-69 genotype on B cell expansion, somatic hypermutation (SHM) and evolution to HV1-69-sBnAb clones

We next investigated whether other B cell functions were affected by IGHV1-69 genotype. Analysis of the naïve and memory IGHV1-69 datasets within each individual’s repertoire revealed additional variation in clonal expansion, SHM frequency, and IgG-to-IgM ratios among the genotypic groups. For example, the frequency of highly expanded IGHV1-69 clones (frequency > 1e-4) was greater for the L/L group than for the F/L or F/F genotypic groups (Supplementary Fig. 6a). However, the clones of the F/F group, which had fewer highly expanded clones, were significantly more mutated than those of the L/L group (Supplementary Fig. 6b). Additionally, we note that IGHV1-69 is unusual among V-genes in that these BCRs appear at a lower frequency in memory B cells than in naïve B cells (an approximately 40% reduction; Supplementary Fig. 6c)14. Interestingly, this effect was strongest in individuals of the F/F genotype. These results suggest that the capacity of IGHV1-69 B cells to undergo expansion, SHM and Ig class switching may differ among the genotypic groups.

An expanded dataset of 57 published HV1-69-sBnAbs2,9,15 was also used to investigate the effects of allele variation on SHM and on the VDJ recombination that results in the signature CDR-H3 amino acids G95, P96 and Y99 ± 1 (Supplementary Fig. 6d). This analysis revealed that transition from the germline L54 to the critical F54 through SHM was a rare event in L/L individuals (Supplementary Fig. 6e), as was the occurrence of HV1-69-sBnAb CDR-H3 signatures in the IgG dataset (Supplementary Fig. 6f). The higher frequencies, in the L/L group, of V-segment amino acid substitutions at positions that are significantly enriched in HV1-69-sBnAbs (Supplementary Fig. 6g) suggest that these Abs may be evolving to compensate for the lack of Phe54 (ref. 9). Collectively, this analysis implies that the scarcity of HV1-69-sBnAbs in L-allele individuals was due to the underutilization of this allelic type by the immune system.

IGHV1-69 Copy Number and Regulatory Region SNPs

We further studied the potential correlation between IGHV1-69 usage and F-allele copy number (CN)12. We found a significant positive correlation between both unmutated IgM and IgG IGHV1-69 utilization and increasing IGHV1-69 F-allele copy number (Fig. 3) (Spearman, IgM r = 0.91, P < 0.0001; IgG r = 0.75, P < 0.0003). A strong positive correlation was also seen between CN and IgM HV1-69-sBnAb clonal frequencies, but a weaker one for IgG, suggesting that the IgG switch memory B cell subset is subject to additional regulation (Supplementary Fig. 4e,f). Interestingly, all L/L individuals were found to have a mean CN = 2, and they also had the lowest IGHV1-69 utilization among the three genotypic groups, even relative to individuals who lack gene duplication (Fig. 3 insets), suggesting that CNV only partially explains the lower IGHV1-69 utilization. For this reason we also investigated other genetic variants in strong linkage disequilibrium (LD) with IGHV1-69 alleles that could represent SNPs influencing the control of transcription or V-D-J recombination rates (e.g., variants in the 5′ UTR and recombination signal sequences). SNPs within the vicinity of IGHV1-69 (±1.5 kb; GRCh37, chr14:107168431-107171928) in LD with the F/L variant (rs55891010) were identified using data from the 1 KG phase 3 dataset for African (n = 661), Asian (n = 504), and European (n = 503) populations. Only four SNPs had an r² > 0.8 in at least one of the three populations (Supplementary Fig. 7a). Two of the identified SNPs represented additional coding variants within IGHV1-69, and the remaining two occurred upstream of the leader sequence ATG start codon. The SNP rs10220412 was found to reside in the 5′ UTR of IGHV1-69 and within a promoter initiator element16, which is also a binding motif of the B cell-associated protein RUNX3, which has been shown to bind to this region in a lymphoblastoid cell line ChIP-seq dataset17 (Supplementary Fig. 7b).
RUNX3 has been shown to be elevated following EBV infection or activation by PMA of primary B cells and is proposed to have a role in B cell proliferation18,19. These findings suggest that genetic factors beyond CN can influence IGHV1-69 transcript frequencies and that the rs10220412 SNP is a candidate that may affect Ab gene transcription in L-allele individuals by hindering the association of RUNX3 with this variant RUNX3/Inr site.
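For readers unfamiliar with the LD threshold used above, the following is a minimal sketch of how the r² statistic between two biallelic sites can be computed from phased haplotypes. The function name and the haplotype counts are hypothetical; the study's r² values came from 1 KG data, and the tool used to compute them is not stated in this section.

```python
def ld_r_squared(haplotypes):
    """Compute the LD statistic r^2 between two biallelic sites.

    haplotypes: list of (allele_at_site_1, allele_at_site_2) pairs,
    each allele coded 0 or 1, one pair per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # allele-1 frequency, site 1
    p_b = sum(b for _, b in haplotypes) / n            # allele-1 frequency, site 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either site is monomorphic")
    return d * d / denom

# Hypothetical phased haplotypes: 4 carry both alleles, 1 carries only the
# first, and 5 carry neither.
haplotypes = [(1, 1)] * 4 + [(1, 0)] * 1 + [(0, 0)] * 5
print(round(ld_r_squared(haplotypes), 4))
```

An r² of 1 means the two sites are perfectly predictive of each other; the cutoff of 0.8 used in the text selects SNPs that nearly always travel with the F/L variant.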

Figure 3: Correlating IGHV1-69 F-allele copy number with IGHV1-69 utilization. (a) Correlating F-allele CN with the frequency of IGHV1-69 IgM clones defined by unmutated V-segments. (b) Correlating F-allele CN with the frequency of IgG clones. The insets in both panels (a) and (b) describe IGHV1-69 clone frequency in individuals that lack IGHV1-69 gene duplication (arrows point to the overlay of two L/L individuals).
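The Spearman coefficients reported for Fig. 3 relate a small set of integer copy-number values to continuous usage frequencies, so ties in CN are unavoidable. The following is a minimal sketch of Spearman's rho computed as the Pearson correlation of mid-ranks, which handles such ties. The function names and the data are hypothetical and do not reproduce the study's values.

```python
import math

def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the mid-ranks of x and y."""
    rx, ry = midranks(x), midranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)

# Hypothetical F-allele copy numbers and IGHV1-69 usage (%) for four people.
copy_number = [2, 2, 3, 4]
usage_percent = [1.5, 2.0, 4.8, 7.9]
print(round(spearman_rho(copy_number, usage_percent), 4))
```

Because two individuals share CN = 2, the coefficient falls slightly below 1 even though usage never decreases as CN rises.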

IGHV1-69 polymorphism has broad effects on the expressed IGHV repertoire

The underutilization of IGHV1-69 germline genes in L/L individuals led us to investigate whether the F/L polymorphism was also associated with different usage frequencies of other V-genes in naïve and memory repertoires. In Fig. 4a,b V-gene frequencies were averaged across individuals within each IGHV1-69 genotypic group using data from the unmutated IgM and all IgG V-segments, respectively, and aligned according to their relative position in the IGH locus on chr14. In addition to IGHV1-69, IGHV2-70 utilization was also significantly different in both the unmutated IgM and IgG datasets, with repertoire frequencies being highest for the F/F and lowest for the L/L group (Spearman, IgM r = 0.64, P = 0.0046; IgG r = 0.57, P = 0.0131) (Fig. 4c,d). This is likely explained by the fact that IGHV1-69 and IGHV2-70 reside on the same duplicated genomic segment of IGHV, and thus exhibit correlated increases in CN (ref. 3; Supplementary Fig. 1b). However, we also found evidence of more spatially separated associations between IGHV1-69 locus polymorphism and other IGHV genes. For example, in the IgG subset, IGHV4-30-4/31 usage was shown to have a significant positive correlation with the occurrence of L-alleles, that is, a negative correlation with F-allele count under the coding L/L = 0, F/L = 1, F/F = 2 (r = −0.53, P = 0.0223) (Inset Fig. 4c (compare red bars)). Although IGHV4-30-4/31 does not achieve significance in the IgM dataset, it was apparent that IGHV4-30-4/31 was part of a cluster of IGHV genes that includes IGHV4-30-2, IGHV3-30/33rn, IGHV4-28 and IGHV3-23 (Inset Fig. 4c,d), all of which were defined by weak to moderate negative correlation coefficients (r = −0.17 to −0.53) and exhibit the highest usage frequencies in L/L individuals. To further assess V-gene usage differences in L/L individuals, we compared V-gene repertoire frequencies between the L/L group and a combined F/L-F/F group using a t-test, and visualized these differences using heatmaps (Supplementary Fig. 8a).
This analysis revealed that, in the unmutated IgM dataset, IGHV3-30/33rn and IGHV4-30-2 were consistently more highly expressed in the L/L group compared to the F/L-F/F group (P < 0.05), whereas IGHV1-24, IGHV1-69, IGHV2-70 and IGHV3-49 were significantly underrepresented in L/L individuals (P < 0.05). In the IgG subset, IGHV4-30-2 and IGHV4-30-4/31 were significantly overrepresented in the L/L group (P < 0.05), and again IGHV1-69 was significantly underrepresented (P < 0.05; Supplementary Fig. 8b). Taken together, we observe that multiple clusters of V-genes within the IGHV locus are positively or negatively correlated with IGHV1-69 genotype.

Figure 4: The antibody repertoire of the three IGHV1-69 genotypic groups. V-gene frequencies were averaged for the L/L group (n = 3), F/L group (n = 11), and F/F group (n = 4) from the datasets of IgM clones characterized by unmutated V-segments (a) and IgG clones (b). The majority of the functional V-genes were tabulated according to their respective positions in the IGH locus (further detailed in Supplementary Fig. 11). Asterisks denote V-genes utilized differently among the three genotypic groups as determined by Kruskal-Wallis test (P < 0.05). Error bars represent standard error of the mean. In panels (c,d) Spearman correlation coefficients are derived for the data presented in panels (a,b) with L/L = 0, F/L = 1, and F/F = 2. Asterisks indicate statistically significant correlations (P < 0.05). Red rectangles mark the location of IGHV1-69 and IGHV2-70, whose usage differed significantly among the three genotypic groups, being highest in the F/F group and lowest in the L/L group, in both the unmutated IgM and IgG datasets. The inset panels are enlarged cropped sections from panels (a,b) of the IGHV4-30-4/31-to-IGHV3-23 region that is negatively correlated with F-alleles.
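The Kruskal-Wallis test named in the Fig. 4 legend compares one V-gene's usage across the three genotypic groups. The following is a minimal sketch restricted to exactly three groups, for which the chi-square approximation has 2 degrees of freedom and a closed-form P value, exp(−H/2). The function names and the usage values are hypothetical, and no tie correction is applied, so results may differ slightly from those of standard statistical packages.

```python
import math

def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_three_groups(groups):
    """Kruskal-Wallis H test for exactly three groups (no tie correction).

    groups: list of three lists of observations.
    Returns (H, p), where p uses the chi-square approximation with
    2 degrees of freedom, whose upper tail is exp(-H / 2).
    """
    if len(groups) != 3:
        raise ValueError("this sketch supports exactly three groups")
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = midranks(pooled)
    h_sum, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        h_sum += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * h_sum - 3 * (n + 1)
    return h, math.exp(-h / 2)

# Hypothetical usage (%) of one V-gene in L/L, F/L and F/F individuals.
h, p = kruskal_wallis_three_groups([[1.2, 1.8, 2.1], [3.0, 4.4, 4.9], [6.5, 7.7, 8.3]])
print(f"H = {h:.2f}, p = {p:.4f}")
```

With group sizes as small as those in the study (n = 3, 11, 4), the chi-square approximation is rough, and an exact or permutation P value would be a reasonable alternative.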

IGHV1-69 F/L polymorphism, copy number and gene duplication among different ethnic groups

To further investigate the association between the F/L polymorphism and CN, we examined published rs55891010 genotypes13 and CN (ref. 3) in 288 samples from three broad ethnic groups (African, Asian, and European) of the 1 KG Project13. Consistent with observations from our H5N1 vaccinee cohort (Fig. 5a upper table), in the combined set of 1 KG samples we found a strong association between rs55891010 genotypes and CN, with higher mean IGHV1-69 CN in F/F (mean = 2.53) and F/L (mean = 2.46) individuals compared to individuals of the L/L genotype (mean = 2; Fig. 5a). We next partitioned these 1 KG samples by ethnicity, which revealed dramatic population differences in the frequency of IGHV1-69 genotypes (Fig. 5a). A significant relationship between IGHV1-69 genotype and CN was found in Europeans, with mean IGHV1-69 CN estimates of 2.55, 2.39, and 2 for the F/F, F/L, and L/L genotypic classes, respectively. This relationship, however, was not clear in the Asian population, as none of the F/F samples in our analysis were found to have more than 2 copies of IGHV1-69 (Fig. 5a,b). Additionally, in the African population, as noted previously3, CN was higher on average overall, but in contrast to Europeans, there was a much larger fraction of F/L individuals, and the majority of these samples were found to have 3 copies of IGHV1-69. The CN trends were also corroborated by a second IGHV1-69 gene duplication assay (Supplementary Fig. 9a,b). The African group was also defined by a markedly low frequency of L/L individuals (Fig. 5a,b).
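The per-genotype means quoted above amount to a simple grouped average of copy-number calls. The following is a minimal sketch of that tabulation; the function name and the sample records are hypothetical and are not drawn from the 1 KG data.

```python
from collections import defaultdict

def mean_cn_by_genotype(samples):
    """Average copy number within each genotypic class.

    samples: iterable of (genotype, copy_number) pairs.
    Returns {genotype: mean copy number}.
    """
    totals = defaultdict(lambda: [0, 0])  # genotype -> [sum of CN, sample count]
    for genotype, copy_number in samples:
        totals[genotype][0] += copy_number
        totals[genotype][1] += 1
    return {g: cn_sum / count for g, (cn_sum, count) in totals.items()}

# Hypothetical (rs55891010 genotype, IGHV1-69 copy number) records.
samples = [
    ("F/F", 2), ("F/F", 3),
    ("F/L", 2), ("F/L", 3), ("F/L", 3),
    ("L/L", 2), ("L/L", 2),
]
for genotype, mean_cn in mean_cn_by_genotype(samples).items():
    print(genotype, round(mean_cn, 2))
```

Running the same tabulation separately for each population gives the stratified comparison described in the text.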

Figure 5: IGHV1-69 F/L polymorphism and CN variations among various ethnicities. (a) Table including mean IGHV1-69 copy number estimates after partitioning by IGHV1-69 rs55891010 genotype, provided for the total combined population, for three broad ethnic groups and the NIH cohort samples that were analyzed by NGS. Bubble plots corresponding to IGHV1-69 CN for each genotypic class in the total combined population (a, lower) and individual ethnic groups (b). In each plot, the area of a given circle is proportional to the number of individuals observed for that particular combination of IGHV1-69 CN and rs55891010 genotype relative to the number of samples analyzed in each group (e.g., Combined, African, Asian, or European).

Next we expanded our analysis of the IGHV1-69 rs55891010 polymorphism to include all samples of the 1 KG cohort (Supplementary Fig. 10). We found that the frequency of the L/L genotype varied considerably across human populations, with the lowest frequencies occurring in samples of African ancestry, and the highest in South Asian populations; as expected, opposite trends were noted for the F/F genotype. Taken together, these analyses indicate that interrelationships among IGHV1-69 F/L genotype, CN and IGHV loci genomic architecture likely exhibit population-specific patterns that may have broad implications for mounting broadly protective HV1-69-sBnAb responses.