Cryo-EM structure determination

Although the C-terminal extension to the S subunit is implicated in capsid assembly and RNA packaging, understanding these roles has been difficult because the normal maturation of the RNA-filled capsid involves its cleavage and dissociation. As a result, no structural information for these residues is currently available. To address this deficiency, we decided to examine the structure of the CPMV eVLP particle. Crucially, not only do such eVLPs completely lack encapsidated RNA, they also undergo C-terminal cleavage more slowly than wild-type virions, raising the possibility that we could determine the structure of an eVLP that retains the C-terminal segment using cryo-EM and single-particle image processing. We therefore collected a cryo-EM data set for CPMV eVLP comprising ∼1,150 micrographs recorded on an FEI Titan Krios microscope using a direct electron detector (for details of data collection and image processing see Methods). Each micrograph was recorded as an exposure movie consisting of 35 frames, which were computationally corrected for microscope drift and beam-induced movement (ref. 30). Particles were selected semi-automatically (ref. 31), and a data set of 62.5k particles was assembled. Iterative rounds of two-dimensional (2D) and 3D classification were then used to select a homogeneous subset of 4,998 particles for 3D structure refinement (see Methods). The final density map, at 3.04 Å resolution, was sharpened using an empirically derived B-factor of −74.6 Å² (Fig. 2a; EMD-3014).

Figure 2: Cryo-EM structures of eVLP and CPMV-B. (a) EM density map of CPMV empty virus-like particle (eVLP) determined by cryo-EM to 3.04 Å resolution (EMDB-3014). The L subunit is shown in green, the S subunit in blue and the additionally visualized 13 amino acids in the C-terminal region of the S subunit in magenta. On the right hand side, a zoomed-in view of the boundary between L and S subunits is shown. The density for an individual β strand is shown in a mesh representation with the EM-derived atomic model within, showing clear resolution of large and small side chains. (b) Identical views as in a, but showing the EM map of CPMV containing RNA-1 (CPMV-B) to 3.44 Å resolution (EMDB-3013).

Wild-type, infectious CPMV particles containing RNA-1 (bottom fraction; CPMV-B) were collected from the bottom of a Nycodenz gradient, dialysed to remove the Nycodenz, and used for cryo-EM data collection. A data set of ∼1,750 electron micrographs was collected on the same microscope and detector as described above. Particles were selected automatically, generating a total data set of ∼72k particles. A homogeneous subset (4,331) of these particles was selected and used to determine a 3D reconstruction. The final structure for CPMV-B, at 3.44 Å resolution, was sharpened using an empirically derived B-factor of −107.6 Å² (Fig. 2b; EMD-3013). Note that the starting model for the eVLP structure was a sphere with a radius of ∼155 Å, and that the CPMV-B structure used the eVLP model filtered to 60 Å resolution. Therefore, no information from the existing X-ray structure was used to generate either of the structures presented here.
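
The particle numbers quoted for the two data sets can be put in context with a short calculation. The Python sketch below is illustrative only: it takes the approximate totals stated in the text (62.5k and ∼72k) as exact, and it assumes that icosahedral symmetry, which has 60 rotational operations, was imposed during refinement, so that each particle contributes 60 copies of the asymmetric unit. The names `datasets` and `summarize` are ours and do not come from any processing package.

```python
# Icosahedral (532) point-group symmetry has 60 rotational operations,
# so each particle image contributes 60 views of the asymmetric unit.
ICOSAHEDRAL_ORDER = 60

# Particle counts quoted in the text; the totals are approximate values.
datasets = {
    "eVLP": {"picked": 62_500, "refined": 4_998},
    "CPMV-B": {"picked": 72_000, "refined": 4_331},
}


def summarize(picked: int, refined: int) -> dict:
    """Return the retained fraction and the number of averaged asymmetric units."""
    return {
        "retained_fraction": refined / picked,
        "asymmetric_units": refined * ICOSAHEDRAL_ORDER,
    }


summary = {name: summarize(**counts) for name, counts in datasets.items()}

for name, stats in summary.items():
    print(f"{name}: {stats['retained_fraction']:.1%} of particles retained, "
          f"{stats['asymmetric_units']:,} asymmetric units averaged")
```

Under these assumptions, well under a tenth of the picked particles enter the final refinement in either data set, yet symmetry averaging still yields several hundred thousand asymmetric units per map.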

Atomic model building

As shown in Fig. 2, the resolution of both eVLP and CPMV-B maps is high enough to clearly resolve amino acid side chains in the density. We therefore decided to build de novo atomic models into the EM density rather than rely on existing atomic models for the CPMV capsid proteins (PDB 1NY7; ref. 4). We started with the higher resolution eVLP map, and built the polypeptide chain of a single copy of both the L and S subunits using Coot (ref. 32). This preliminary model was then iteratively refined and rebuilt using REFMAC5 (ref. 33) and Coot to progressively improve model quality. The resulting model contained information for the majority of the polypeptide sequence, critically including a 13-residue segment in the C-terminal region of the S subunit that had never previously been visualized. The refined eVLP atomic model was then docked into the 3.4-Å CPMV-B map. Residues in the eVLP atomic model for which no density was observed in CPMV-B were deleted (residues 190–202 in the S subunit), and amino acids resolved in CPMV-B but not in eVLP were added and modelled (residues 184–189 in the S subunit). This preliminary CPMV-B model was then again iteratively refined in REFMAC5 to give the final model presented in Fig. 2b.

The structure of the C-terminal extension to the S subunit

The existing structural information for the CPMV capsid (refs 3, 4) shows the C terminus of the S subunit after cleavage (ending at residue 189) in an extended conformation running across the exterior surface of the capsid towards a cleft between the S subunits that form the turret at an icosahedral fivefold vertex. This is precisely the conformation we see in our CPMV-B structure (see the yellow segment in Fig. 3a), but in the eVLP map we see additional density in this cleft that does not match the previously deposited structure. The density that would correspond to residues 184–189 in the C terminus is very weak, suggesting that this segment is poorly ordered in the particle in solution (see the yellow segment in Fig. 3b), and we have not been able to build a convincing model into this region of the map. However, it is clear that the polypeptide chain takes a steeper path along the edge of the cleft than it does once C-terminal cleavage (between residues 189 and 190) has occurred (compare the yellow segments in Fig. 3a,b). The C-terminal segment then becomes ordered once more, and we see density corresponding to Leu190 to Arg202, residues absent from previous structures. A loop runs from the top of the S subunit back into the cleft between subunits, before forming two turns of α-helix running out of the cleft towards the bulk solvent (see magenta segment in Fig. 3b). The bottom of this segment appears to be very well ordered, with clear density for side chains that make intimate contacts with the neighbouring S subunit around the penton (Fig. 3c). The density then becomes disordered once more, with Arg202 as the last ordered residue, suggesting that the 11 C-terminal residues are disordered in solution. Intriguingly, this tallies with functional observations that while truncations of the C-terminal segment by up to 11 residues are tolerated, larger truncations (12 residues or more) dramatically reduce the yield of intact eVLPs (see Supplementary Fig. 1 and Supplementary Table 1).
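
The correspondence between the ordered residues and the truncation data is a matter of residue arithmetic, which the following Python sketch makes explicit. It assumes that the S subunit ends at residue 213, as implied by the residue range 190–213 given in the legend of Fig. 3; the constant and function names are illustrative.

```python
S_CLEAVAGE_SITE = 189  # processing occurs between residues 189 and 190
LAST_ORDERED = 202     # Arg202 is the last residue with ordered density
S_LENGTH = 213         # assumed final residue, from the range 190-213 in Fig. 3

ordered_segment = range(S_CLEAVAGE_SITE + 1, LAST_ORDERED + 1)  # 190..202
disordered_tail = range(LAST_ORDERED + 1, S_LENGTH + 1)         # 203..213


def ordered_residues_removed(truncation: int) -> list:
    """List the ordered residues deleted when `truncation` residues are
    removed from the C terminus of the S subunit."""
    new_last = S_LENGTH - truncation
    return [r for r in ordered_segment if r > new_last]


print(f"ordered segment: {len(ordered_segment)} residues")
print(f"disordered tail: {len(disordered_tail)} residues")
for n in (0, 11, 12, 13):
    print(f"truncation of {n:2d}: ordered residues removed = "
          f"{ordered_residues_removed(n)}")
```

An 11-residue truncation removes only the disordered tail, whereas a 12-residue truncation is the first to delete an ordered residue (Arg202), matching the threshold at which eVLP yield drops.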

Figure 3: The structure of the C-terminal extension of the S subunit. (a) Unsharpened EM density map of CPMV-B with colours as described previously. In yellow is the C terminus of CPMV-B (amino acids 184–189), which follows the same path as the current atomic model (PDB 1NY7; ref. 4). The final C-terminal amino acids (190–213) are missing from both the crystal structure and the CPMV-B EM density map. (b) Unsharpened EM density map of the eVLP with colours as described previously. The density corresponding to amino acids 184–189 is coloured yellow. Although this section of the EM density is too weak to allow a polypeptide backbone to be built, we can clearly see that this portion of the C terminus moves in the eVLP map compared with the CPMV-B map (see yellow segment in a). Coloured magenta is the newly resolved 13-amino-acid segment (residues 190–202 in the S subunit). (c) Zoomed-in version of the C-terminal extension from the sharpened EM density. The new atomic model is shown inside.

The role of the C-terminal extension

The ordered C-terminal segment described for the first time here forms an intimate network of interactions with the neighbouring S subunit around the pentameric ring that forms the fivefold vertex of the particle. It is clear from the structure that hydrophobicity plays a central role in this network. Shown in Fig. 4a is the EM-derived atomic model for the eVLP represented as a surface, and coloured according to the hydrophobicity of the corresponding amino acid residues (see legend of Fig. 4 for details). Two phenylalanine residues in the C-terminal segment (F192 and F194) are well resolved and appear to bind to a large hydrophobic patch on the body of the neighbouring S subunit. To test the importance of these interactions, we made mutations in the S subunit sequence and analysed their effects on both eVLP assembly and RNA packaging by the virus. While F192W has little discernible effect on eVLP assembly, it dramatically reduces the efficiency of RNA packaging, resulting in large numbers of empty capsids, and systemic movement of the virus in the plant does not occur (Supplementary Tables 2 and 3; Fig. 5a). Mutation of the matching hydrophobic surface on the S subunit itself (for example, V109W) has even more profound effects, preventing assembly of particles (Fig. 5b). However, the network of interactions is complex, as mutation of the other phenylalanine residue (for example, F194W) has little discernible effect other than a slightly reduced particle yield.

Figure 4: Interactions between the C-terminal extension and the neighbouring S subunit. (a) EM-derived atomic model for the eVLP S subunit and C-terminal extension represented as a surface model and coloured according to hydrophobicity (I, L and V: orange; G, A and F: pale orange; C and M: yellow). The middle panel shows the surface of the S subunit with the C-terminal extension removed, and the right panel shows the surface of the C-terminal extension that interacts with the S subunit. Hydrophobic residues are labelled. (b) As in a coloured according to charge (red is negative, blue is positive). The charged residues from S subunit are labelled. All residues in Fig. 4 are from the S subunit.

Figure 5: Residues in the S subunit that are important for particle assembly and genome encapsidation. (a) Agarose gels stained with either Coomassie blue or ethidium bromide (EtBr) show that the F192W mutant packages no RNA. Negative stain EM illustrates ‘empty’ particles in F192W mutant compared with WT. Scale bars, 100 nm. SDS–PAGE shows that the level of protein expression is comparable in F192W compared with WT in infiltrated leaves; however, F192W is unable to cause a systemic infection. (b) SDS–PAGE showing that the V109W mutation abolishes the assembly of eVLP. (c) SDS–PAGE showing R193D and E147R mutations also prevent eVLP assembly; however, the double mutation of E147R/R193D, which preserves the salt-bridge, permits very similar levels of capsid assembly compared with WT.

Despite the extensive nature of the hydrophobic surface on both the C-terminal extension and the S subunit surface to which it binds, a number of charged residues also appear to play a key role. The C-terminal segment is highly basic, and side-chain density for two arginine residues is visible at the bottom of the cleft (R193 and R195; see Fig. 4b). R193 is particularly well ordered and forms a salt-bridge to E147, again in the neighbouring S subunit. To test the importance of this interaction, we mutated these residues and assayed for eVLP assembly and viral encapsidation of RNA in vivo. Both the R193D and E147R mutants are completely unable to assemble, while the double mutant R193D/E147R, which preserves the salt-bridge but swaps its directionality, is almost indistinguishable from the wild type (Fig. 5c). R195G is similar to wild type in terms of assembly, suggesting that it is the R193-E147 salt-bridge that is crucial for assembly (Supplementary Table 2).
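
The logic of the charge-swap experiment can be summarized with a deliberately simple model. The Python sketch below only checks whether positions 193 and 147 carry opposite formal charges; it ignores geometry, side-chain length and local environment, and is an illustration of the reasoning rather than a predictive tool. The names `CHARGE`, `WILD_TYPE` and `bridge_possible` are hypothetical.

```python
# Formal side-chain charges near neutral pH for the residue types involved.
CHARGE = {"R": +1, "K": +1, "D": -1, "E": -1, "G": 0}

# Wild-type residues at the two salt-bridge positions, which lie on
# neighbouring S subunits.
WILD_TYPE = {193: "R", 147: "E"}


def bridge_possible(mutations: dict) -> bool:
    """Return True if positions 193 and 147 carry opposite formal charges."""
    residues = {**WILD_TYPE, **mutations}
    return CHARGE[residues[193]] * CHARGE[residues[147]] < 0


variants = {
    "WT": {},
    "R193D": {193: "D"},
    "E147R": {147: "R"},
    "R193D/E147R": {193: "D", 147: "R"},
}

for name, muts in variants.items():
    print(f"{name:12s} opposite charges at 193/147: {bridge_possible(muts)}")
```

In this toy model the two single mutants place like charges opposite each other, while the double mutant restores a complementary pair, which is consistent with the assembly phenotypes in Fig. 5c.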

Interactions between the protein capsid and genomic RNA

The way in which eVLPs are expressed means that no genomic RNA is present in the cell, so none can possibly be packaged. Moreover, the eVLP has previously been shown to be devoid of plant cell mRNA (ref. 13), including the recombinant message for either the viral coat proteins or proteinase, which are the two mRNAs that should have the highest sequence similarity to the genome (ref. 12). Indeed, in the 3.0 Å structure of the eVLP there is no EM density that can be attributed to anything other than capsid proteins. Together with the very low A260/280 ratio of the particles and their lack of ethidium bromide staining in agarose gels (ref. 13), this strongly suggests that the eVLPs are devoid of RNA.

By contrast, the wild-type CPMV-B particle has packaged the full-length, 6-kb single-stranded RNA-1, and as expected we see significant extra density inside the capsid that we ascribe to this packaged RNA genome. However, the B-factor correction used to sharpen the map and reveal high-resolution features such as amino acid side chains acts as a strong high-pass Fourier filter, removing low-resolution features in the map such as poorly ordered molecular components like the genomic RNA. Shown in Fig. 6a is a 40-Å-thick central slab through the unsharpened CPMV-B map (at 3.63 Å resolution; the unsharpened map is also included in the deposition for EMDB-3013). As seen in the cryo-EM structures of several single-stranded RNA viruses (refs 34, 35), the RNA appears as concentric shells of density. It must be noted that this density is an icosahedrally averaged picture of an asymmetric RNA molecule, so precise structural interpretation is impossible. However, several observations can be made. First, the shells of density have a thickness of ∼20 Å, and are ∼20–25 Å apart, consistent with the packing of duplex nucleic acids observed in other virus structures (for example, refs 34, 36), suggesting that extensive base pairing occurs during encapsidation. The general form of the packaged RNA is dodecahedral rather than icosahedral (each is a different realization of 532 symmetry), with the strongest RNA feature in the map forming a truncated dodecahedral cage beneath the capsid shell. This is strongly reminiscent of RNA packaging in insect viruses of the Nodaviridae, where ordered genomic RNA is packaged as a dodecahedral cage in both X-ray and EM structures (refs 37, 38). The strongest CPMV density is directly beneath the twofold symmetry axes of the capsid, which are formed by the interface between two adjacent pentons, implying that this is the site where RNA binding is strongest. The density fades out as the RNA extends away from the twofold axis towards the threefold junctions that form the vertices of the truncated dodecahedron. We see two major bridges of density between the capsid shell and the RNA density that are candidates for amino acid side chains, both from the L subunit, that directly interact with RNA (Fig. 6c). These are Arg17 and Trp190, with Trp190 being by far the strongest density feature connecting the capsid shell to the RNA.
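
The filtering effect of B-factor sharpening noted above can be illustrated numerically. The Python sketch below assumes the commonly used convention in which structure-factor amplitudes at resolution d are multiplied by exp(−B/(4d²)); the exact convention applied by the software used here is not stated in this section, so the numbers are indicative only. The function name `sharpening_gain` is ours.

```python
import math


def sharpening_gain(b_factor: float, resolution: float) -> float:
    """Amplitude scale factor exp(-B / (4 d^2)) at resolution d (in Å).

    With a negative B-factor the gain exceeds 1 and grows towards
    high resolution (small d). This convention is an assumption.
    """
    return math.exp(-b_factor / (4.0 * resolution ** 2))


B_CPMV_B = -107.6  # Å^2, the value quoted in the text for the CPMV-B map

for d in (40.0, 20.0, 10.0, 5.0, 3.44):
    gain = sharpening_gain(B_CPMV_B, d)
    print(f"d = {d:5.2f} Å   amplitude gain = {gain:6.2f}")
```

Under this convention, features at 20–40 Å spacing, such as the RNA shells, are left almost unchanged in amplitude, while terms near the 3.44 Å resolution limit are boosted roughly tenfold. After rescaling for display, the low-resolution RNA density is therefore strongly suppressed relative to the side-chain detail, which is why the unsharpened map is used to examine the RNA.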

Figure 6: Density for RNA-1 in the CPMV-B structure. (a) A 40-Å thick central slab through the unsharpened CPMV-B map (at 3.63 Å resolution; unsharpened map is also included in the deposition for EMDB-3013, suggested contour level 0.015). The extra density ascribed to RNA is pink. Viral coat proteins coloured as described previously. (b) The strongest density for RNA is found beneath the capsid twofold axis, a binding site formed at the interface between two adjacent pentons. (c) Close up of viral RNA–protein interactions, demonstrating two major bridges of density between the viral RNA and the protein capsid. The density bridges are consistent with the involvement of W190 and R17 (both from the L subunit) in RNA binding.

To test the importance of these residues, we mutated each and examined the effect on eVLP assembly and RNA encapsidation in vivo (Fig. 7). Arg17 does indeed appear to be important for RNA packaging. Although R17D can be introduced into eVLPs with little effect, R17E substantially reduces eVLP capsid assembly (Supplementary Table 2; Fig. 7a). In wild-type virus, R17E and R17D abolish RNA packaging and substantially reduce capsid assembly (Supplementary Table 3; Fig. 7b). R17W, R17G and R17K are all indistinguishable from wild-type virus (Supplementary Table 3; Fig. 7b), with identical yield and systemic transport in plants, suggesting that some degree of flexibility in the nature of the residue is tolerated. Similarly, the W190A and W190D mutants are both indistinguishable from wild-type virus, while W190F abolishes both RNA binding and capsid assembly (Supplementary Table 3; Fig. 7c).