Capsid (CA) proteins from different retroviruses have remarkably low sequence identity23,24. So divergent are the sequences that even amino acids shown to be critical in one retrovirus are not conserved in another. For example, in HIV-1 CA the Y164F or G208E mutations attenuate virion production and infectivity18. Although these amino acids are essential for the function of HIV-1 CA, they are not individually conserved, and CA from feline immunodeficiency virus (FIV) inherently harbors F and E at these positions, respectively (Fig. 1A). Nevertheless, the α-helical fold of the CA protein and the capsid-core assemblies are highly conserved across the various retroviruses, including HIV-1, equine infectious anemia virus (EIAV), human T-cell leukemia virus type I (HTLV) and Rous sarcoma virus (RSV) (reviewed in25). There appears to be a strong selective pressure that necessitates maintenance of a specific functional structure even during the mutation and evolution of CA-encoding sequences10,23,26. Indeed, a more holistic sequence comparison demonstrates the significance of coevolved substitutions and resolves the conundrum of the strict conservation of structure and function despite considerable variability in amino acid sequences. A simple sequence analysis immediately highlights that the Y164 mentioned above forms part of a conserved amino acid pair, F/Y in primate viruses (HIV-1 F161/Y164), that is switched to Y/F in non-primate viruses (Fig. 1A). This suggests that the deleterious action of the single Y164F mutation in HIV-1 CA might be rescued by the acquisition of a second-site mutation, F161Y, which would restore the conserved amino acid set.
To probe for such structurally essential and potentially correlated pairs in the lentiviral CA, we determined the structure of the C-terminal domain (CTD) of FIV CA (CACTD), which naturally evolved to cope with substitutions individually shown to be deleterious to HIV-1 CA, and compared it to the other non-primate and primate lentiviral CA structures available in the Protein Data Bank. Importantly, the CTD segment of CA, unlike the N-terminal domain (NTD), has been reported as biologically equivalent and functionally transferable between primate and non-primate lentiviruses23, making the interspecies comparison between FIV and HIV-1 physiologically relevant.

Figure 1 Preservation of FIV CACTD overall structure. (A) Sequence alignment of CACTD. Top numbering and secondary structure are shown for HIV-1 (4XFX) and bottom numbering for FIV. Identical residues are shown in bold red font and conserved equivalents in red font. Bold blue font indicates correlated sites that are different between primate (black titles) and non-primate (pink titles) lentiviruses. (B) Superposition of CACTD of FIV (magenta), HIV-1 (blue, 4XFY) and EIAV (yellow, 1EIA). (C, D) Fo-Fc map (blue mesh, 2.5 σ) calculated after omitting the corresponding fragments: the hook-like residues 197–202 (C) or the loop (residues 166–174) connecting α-8/α-9 (D; gray mesh represents the 2Fo-Fc map at 1.0 σ). Panels (C) and (D) are in walleye stereo view.

The highly preserved FIV CACTD fold reveals only minor structural subtleties

As expected, FIV CA adopts a highly conserved CTD fold comprising the characteristic four-α-helix bundle (α-8 to α-11), which superimposes closely on the CA of EIAV (0.55 Å RMSD) and HIV-1 (1.14 Å RMSD on average over several structures) (Fig. 1B).
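
An RMSD between two structures is only meaningful after the equivalent atoms have been optimally superposed. The following is a minimal sketch of such a calculation, assuming that matched Cα coordinates have already been extracted into two equal-length arrays. The function name `kabsch_rmsd` and its inputs are illustrative, and the Kabsch algorithm used here is a common choice for rigid-body superposition, not necessarily the procedure or software used in this study.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition.

    P and Q are hypothetical inputs: matched atom coordinates (for example,
    equivalent C-alpha atoms), listed in the same order in both arrays.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove the translational component by centering on the centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # SVD of the covariance matrix yields the optimal rotation (Kabsch).
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so that R is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the residual deviation.
    diff = (R @ Pc.T).T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Averaging this value over several reference structures would correspond to the kind of mean RMSD reported for HIV-1; which atoms are treated as equivalent is a separate choice that strongly affects the result.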

Along with maintaining the highly conserved fold, the structure of FIV CACTD also retains the delicately tuned plasticity of the CA α-helices and connecting loops, which drive and support the dynamic structural changes that CA undergoes during the viral lifecycle. In the FIV CACTD structure, the flexible 3₁₀-helix adopts a rather extended conformation similar to a previous NMR observation in tubular assemblies of HIV-1 CA27 (Fig. 1B). Since this 3₁₀-helix is contained within the inter-domain linker connecting the CTD and NTD, its dynamic nature has been anticipated to allow the flexible reorganization of the two domains during CA assembly27. The loop connecting α-10 and α-11 forms a distinct hook-like structure in FIV CACTD, a feature that is not seen in the HIV-1 CA structures but is superimposable on that of the EIAV CA structure (Fig. 1B,C). This hook-like configuration appears necessary to accommodate the larger Glu, Lys and Arg side chains in non-primate lentiviruses as compared to the conserved Gly in primate viruses (discussed below). Another elastic connector is the flexible loop connecting α-8 and α-9, which was poorly resolved in the FIV CACTD structure and could not be modeled exactly according to the structures from HIV-1 or EIAV, as these would clash with the well-structured K174 of FIV CACTD (Fig. 1D). This loop has indeed been shown to display a wide range of conformations in reported CA structures, depending on the arrangement of the dimeric interfaces, and is missing in some crystal structures28 and in the solid-state NMR analysis of tubular CA27. Flexibility of this loop correlates with the crucial pliability of α-9, which in the FIV CACTD structure was best superimposed on α-9 of HIV-1 CA in the CAI-inhibitor bound (PDB: 2BUO29) and the dehydrated crystal (PDB: 4XFY30) states (Fig. 1B).
A disulfide bridge between two conserved lentiviral Cys residues (190 and 210 in FIV), conceivably important for CA flexibility, was reduced, as in the structures of HIV-1 CA bound to CAI29 and of the domain-swapped form24.

A novel dimeric interface in the FIV CACTD structure mimics the binding of the CAI inhibitor helical-peptide

The flexibility of CA plays a pivotal role during the conformational transitions that are triggered upon proteolytic maturation of the viral Gag polypeptide31,32 and that result in the trimeric CTD–CTD interface unique to mature capsid formation and stability30,32. Distinct CA interaction interfaces have been implicated at different stages of Gag maturation, including the packing of NTD α-4 against a hydrophobic pocket formed by α-8 and α-9 in the CTD30,33. This hydrophobic pocket was previously shown to bind small-molecule30 and CAI peptide29 inhibitors, and it has been proposed to interact with host cofactors10. Mutational analysis of pocket residues suggested that CTD–CTD dimerization is very sensitive to the detailed contacts within this conserved groove and underscored an allosteric role in regulating CA reorganization33.

The asymmetric unit of the FIV CACTD crystal contains two CTD molecules packed through a novel dimeric interface that docks α-11′ of the symmetry mate (denoted with the prime symbol) into this conserved pocket, resembling the binding of the CAI helical peptide and the PF74 small molecule (Fig. 2). Docking of α-11′ at this groove also parallels the packing of α-11′ from symmetry-related molecules in the crystals of two HIV-1 CA mutants, Y169A and L211S (PDB: 3DS2 and 3DPH33), and comparison with these structures demonstrates the spatial superposition of two conserved residues, Glu (HIV-1 E212′ and FIV E204′) and Leu (HIV-1 L211′ and FIV L203′) (Fig. 2 and inset, red helix). Indeed, an E212A mutation in HIV-1, which would disturb such α-11 docking, reduces infectivity 3-fold25, implying a potential biological relevance of such docking, perhaps transient, during capsid maturation or assembly.

Figure 2 FIV CACTD crystal packing. FIV CACTD dimer (magenta and gray) compared to HIV-1 CA bound to CAI helical peptide (cyan ribbon, 2BUO), symmetry NTD α-4 (blue ribbon, 4XFZ) or PF74 inhibitor (blue sticks, 4XFZ). Symmetry HIV E71′ and FIV E204′ are shown in blue and gray lines, respectively. α-11′ from a symmetry mate of the HIV Y169A mutant is shown (red ribbon, 3DS2). The red asterisk indicates the position of HIV L211 (FIV L203). Inset: close-up of the core around the red asterisk. Residues of HIV and FIV are shown in lines (symmetry mates denoted with a prime symbol). Dashed magenta (FIV) and blue (HIV) lines denote 2.6–4.5 Å distances. Two conformations for Q67′, native (blue) and modeled (dark blue), are shown. Residue labels are colored according to structure (HIV: blue, FIV: magenta, EIAV: yellow).

Walling the groove are FIV F161 (HIV-1 Y169) and L203 (HIV-1 L211), which appear essential for packing α-11 from the adjacent monomer (Fig. 2, inset). These residues are indeed conserved in most retroviruses, and while the overall intrinsic structures of the HIV-1 Y169A and L211A/S CACTD mutants were not altered, cone-shaped cores and virus infectivity were completely lost, suggesting that these positions are critical for CA reorganization and assembly of mature-like particles33. Further, HIV-1 Y169 and L211 were also implicated in the docking of α-4 of the NTD to this CTD groove (Fig. 2, blue helix). This packing of α-4 places the vital and conserved HIV-1 E71′ (and Q67′)25 in close spatial resemblance to the conserved E204′ of FIV α-11, mimicking comparable contacts with the backbone amide of HIV-1 L211 (FIV L203) from the adjacent CA monomer (Fig. 2, inset). Therefore, the novel crystal packing of FIV CACTD molecules via the binding groove of the CTD from one monomer and α-11′ of the adjacent one, while resembling reported interactions of HIV-1 CA subunits, suggests that the subtle allosteric groove of FIV CACTD is also preserved despite the different amino acid composition.

FIV CACTD coevolved-substitutions preserve the functional fold despite sequence divergence

The level of sequence homology (~40% identity) between FIV and HIV-1 CACTD is sufficient to encode a highly preserved structure. Still, 70% of the 135 tested single point mutations in HIV-1 CA yielded defective viruses18, indicating that the divergent sequences comprise coupled substitutions that compensate for one another, delicately tuning and preserving the functional fold of CA. To uncover such compensatory correlation rules, we analyzed the divergent sets of structurally correlated amino acids at the inter- and intra-molecular interfaces that facilitate such delicate structural preservation. The amino acid sets examined here were those that can directly modulate the intrinsic stability, folding and self-assembly of the CA protein. Amino acid sets that can alter the way CA exploits different cellular cofactors and alternative pathways were beyond the scope of this study.
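
For readers unfamiliar with how a percent-identity figure is obtained, the following is a minimal sketch that counts identical residues over the aligned, non-gap columns of a pairwise alignment. The function name, the gap convention and the choice of denominator are assumptions made for illustration (several definitions of percent identity are in common use), and the short strings in the usage example are toy sequences, not the FIV or HIV-1 CACTD sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length. The denominator (non-gap columns) is one of several
    conventions in use; it is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy example (hypothetical sequences): 7 identical residues in 9 gap-free columns.
toy_identity = percent_identity("ACDEFGHIKL", "ACDQFG-IKV")
```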

At the core of HIV-1 CACTD is a conserved Y164. Mutation to Phe, which is naturally present at this position in other retroviruses24, causes a 36% reduction in HIV-1 single-cycle infectivity and a 69% decrease in spreading fitness18, suggesting a functional role for the Y164 hydroxyl oxygen. Indeed, structural analysis of the available HIV-1 CA structures reveals a conserved interaction (2.5 Å) between this oxygen and the backbone amide of residue 190, an interaction that appears to pin down the C-terminal end of α-9 while leaving its N-terminal end extremely flexible (Fig. 3A). To maintain this apparently essential structural feature in the presence of Phe rather than Tyr, non-primate lentiviruses appear to have coevolved a Tyr at position 161 (F161 in HIV-1), thereby preserving, in a flipped direction, the F161/Y164 pattern of HIV-1 as Y153/F156 (FIV CA) (Fig. 1A). This configuration both preserves the pinning interaction between Tyr and the C-terminal end of α-9 (Fig. 3A) and maintains the hydrophobic core features. Therefore, whilst the F/Y pair is imperative for structural integrity, the relative position of the two residues is apparently interchangeable.

Figure 3 Structural basis of correlated substitutions. (A) Superposition of CACTD of FIV (magenta), EIAV (yellow, 1EIA), HIV-1 native (dark blue, 4XFX), dehydrated (blue, 4XFY) and domain-swapped (cyan, 2ONT) forms. HIV-1 residue numbering is shown. Dynamic N-terminus of α-9 (spheres with red asterisks: Cα of E179) is highlighted with a dashed gray arrow. Red arrow stresses steric hindrance between HIV P207 and α-9 bulge. Interactions are shown in dashed lines colored yellow (EIAV), blue (HIV) or magenta (FIV). Residue 190 denotes the static C-terminus base of α-9. (B) Trimeric CA interface of FIV (magenta) modeled per native HIV-1 structure (dark blue, 4XFX). HIV A204 (blue lines) and water molecules (red spheres) and FIV H196 (magenta lines) are shown. Fo-Fc map (blue mesh, 2.5 σ) calculated after omitting H196 of FIV CA. (C) Correlated residues of FIV CA (magenta ribbon) are presented in magenta spheres (Cα) and side chains (lines). Residue numbering of FIV (white font) and HIV-1 (blue font) are shown. Correlated pairs are coupled with dashed lines of black (high score) or gray (weaker score) colors and the Gremlin probability score5 is shown in magenta (FIV) and blue (HIV-1) font. ND: not detected.

Indeed, comparison of primate and non-primate CACTD amino acid sequences highlights a sequence pattern that extends beyond this F/Y or Y/F pair to include F/Y/F/L/M in primate viruses and Y/F/L/S(T)/K in FIV and other non-primate viruses (Fig. 1A). Switching the L190/M214 hydrophobic packing pair of primate viruses into the S(T)/K polar pair in non-primate viruses creates a potential hydrogen bond in the latter structure (Fig. 3A). Creation of such a polar interaction apparently requires the presence of both S(T) and K, since a single substitution to L190S or M214L in HIV-1 CA resulted in either non-viable viruses or viruses with diminished viability and infectivity18. Therefore, the inherent FIV S182 (T190 in EIAV) at this HIV-1 190 position (Fig. 1A) may have been tolerated in the non-primate viruses through the co-acquisition of the compensatory Lys partner (K206 in FIV and K214 in EIAV, instead of HIV-1 M214) (Fig. 3A). The stabilizing effect of the HIV-1 L/M packing or the FIV S/K interaction appears crucial for the accurate positioning and pinning of the α-9 C-terminal base, in a similar manner to the F/Y or Y/F pair discussed above.

FIV K206 also interacts (2.7 Å) with the carbonyl oxygen of the conserved FIV P199 (HIV-1 P207), stabilizing the above-described hook-like structure of the loop connecting α-10 and α-11. In HIV-1, this loop, which has been shown to mediate the trimeric interface30 and to change conformation significantly27 during CA assembly, harbors a crucial G208 residue, and a G208E mutation resulted in diminished HIV-1 infectivity and spreading fitness18. FIV CA inherently possesses E200 (E208 in EIAV) at this position; however, the detrimental effect of HIV-1 G208E appears to have been compensated by the coevolution of a polar side chain in FIV, S201 (D209 in EIAV), which substitutes for HIV-1 A209. S201 interacts (2.9 Å) with the backbone amide of residue 198 in FIV (206 of HIV-1/EIAV), which forms the hook-like structure accommodating and “pulling back” the larger E200 residue (Fig. 1C). Indeed, a polar side-chain substitution, A209T, in HIV-1 retained wild-type infectivity (108%) and spreading fitness, while substitution with a nonpolar side chain, A209V, retained only 72% infectivity18.

The smaller non-primate L160 (Y/F/L/S(T)/K/H), which substitutes for the primate F168 (F/Y/F/L/M/A), correlates better with the hook-like structure, since F168 in HIV-1 causes a bulge in the middle of α-9 that would conceivably clash with the conserved P207 in a non-primate hook-like structure (Fig. 3A, red arrow).

Similarly, the overall structural conservation at the delicate trimeric (threefold) interface conceals significant sequence variations and corresponding adaptations. The HIV-1 A204 is substituted with a bulkier H196 in FIV CA (Fig. 1A). While hydrophobic substitutions of A204 to V/L retained HIV-1 infectivity and capsid stability, the A204D replacement resulted in the production of non-infectious virions with unstable and abnormal cores30,34. Interestingly, two structured water molecules, positioned near A204 at the trimeric interface of native CA (Fig. 3B), were lost upon crystal dehydration, resulting in tighter packing of CA subunits30. The hydration layer, especially at the two- and threefold interfaces, has been proposed to stabilize these variable interfaces and complement the flexible surface of CA30. Superposing the FIV structure on the trimeric interface of the native HIV-1 CA structure (PDB: 4XFX) reveals that the FIV H196 imidazole ring falls precisely onto the two waters from each HIV-1 CA monomer (Fig. 3B). This supports the importance of the location of these two waters for CA flexibility and highlights an equivalent role for H196 in FIV CA flexibility.

Analyzing coevolved residues using the Gremlin pseudolikelihood method, which was previously used to predict protein structures from coevolution-derived distance restraints5, revealed a correlation probability of ~1.0 for the adjacent pairs within HIV-1 F/Y/F and FIV Y/F/L, supporting our structure-based correlation analysis (Fig. 3C). Likewise, a correlation between FIV H196 (HIV A204) at the trimeric interface (Fig. 3B) and the covariant FIV E200 residue (G208 in HIV) of the hook-like loop was successfully detected (Fig. 3C). However, the correlation between the S/K pair, which was revealed by our FIV CACTD structural analysis, was poorly predicted, with a probability of 0.37 in FIV, and was missed for HIV-1 (Fig. 3C), emphasizing the power of structural analysis in revealing spatially coupled correlations.
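
To make the notion of covariation between alignment positions concrete, the following is a minimal sketch that scores a pair of alignment columns by their mutual information. This is a deliberately simpler stand-in and is not the Gremlin method, which fits a global statistical model by pseudolikelihood and reports calibrated probabilities. The function names are illustrative, and the four three-residue sequences are a toy alignment constructed to mimic an F/Y to Y/F swap, not real CA sequences.

```python
from collections import Counter
from math import log2

def column(alignment, idx):
    """Return the residues found at position idx in every aligned sequence."""
    return [seq[idx] for seq in alignment]

def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    High values mean that the residue at one position predicts the residue
    at the other, as expected for a covarying pair. No correction for
    phylogeny or sampling bias is applied in this sketch.
    """
    n = len(col_i)
    if n == 0 or n != len(col_j):
        raise ValueError("columns must be non-empty and of equal length")
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Toy alignment (hypothetical): positions 0 and 2 swap F/Y together,
# while position 1 varies independently of position 0.
toy_alignment = ["FAY", "FGY", "YAF", "YGF"]
coupled = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
uncoupled = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
```

In this toy case the swapped pair scores 1 bit and the independent pair scores 0. With real alignments, raw mutual information is confounded by shared ancestry and indirect couplings, which is one reason global models such as Gremlin are preferred.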

Here we provide a simple demonstration of the power of interspecies structural analysis in elucidating the functional coupling of coevolved and spatially correlated pairs of substitutions, which, when uncoupled and individually assessed, can mistakenly implicate fragility and suggest misleading hot-spots for therapeutic targeting. Structural comparison of FIV and HIV-1 CA reveals the mechanistic basis for the functional coupling of coevolved sets that spatially cooperate to preserve a viable structure, protecting the virus from assumed genetic fragility. The ability to circumvent the deleterious effects of single amino acid substitutions through cooperative secondary substitutions allows mutational flexibility that may afford viruses an important survival advantage. This mechanism provides the virus with the flexibility to exploit alternative but functionally equivalent patterns when the default ones are impaired by mutations, especially during the acquisition of resistance to antiviral therapeutics and cellular restrictions. The accessibility of such hidden escape mechanisms to particular viruses can be uncovered by identifying distinct but functionally equivalent interactions that have naturally coevolved in related viruses. Intrinsic differences between the selective pressures of cell culture and whole organisms underscore the need for an adequate animal model to evaluate potential risks from newly emerging resistant virus strains, and in this regard the ability to investigate FIV coevolution in its natural host offers a unique opportunity.