Structure determination

For structural analysis, the ectodomain of AMN (residues 20–357) and an N-terminal fragment of cubilin (residues 26–135) were co-expressed in Escherichia coli. Crystals of the AMN(20–357)–cubilin(26–135) complex were obtained and the structure was determined to 2.3 Å resolution by the single-wavelength anomalous diffraction (SAD) method (Table 1). Electron density is observed for the entire ectodomain of AMN and for cubilin residues 37–120. The final model displays good crystallographic and geometric statistics with Rwork/Rfree = 0.21/0.23.

Table 1 Data collection, phasing and refinement statistics

Structure of AMN(20–357)

AMN forms a bean-shaped structure protruding ~60 Å from the plasma membrane and is composed of four structural entities (Fig. 1). A SEA (for Sperm protein, Enterokinase and Agrin) domain is located in the C-terminal part of AMN directly preceding the transmembrane helix17. The SEA domain has a classic βαββαβ-fold18 found in various types of proteins including initiation and elongation factors19. A cysteine-rich region links the SEA domain to the N-terminal part of AMN. With the exception of a single α-helix the cysteine-rich region has no apparent secondary structure. The fold of the two N-terminal domains of AMN (denoted β-helix 1 and β-helix 2) is similar and is formed by right-handed β-helices with hydrophobic cores. Both domains are shaped as inverted three-faced pyramids but they differ substantially in the base regions. The base of β-helix 1 is engaged in interactions with cubilin, whereas the base of β-helix 2 is capped by two amphipathic α-helices that segregate the hydrophobic core of the domain from the solvent.

Fig. 1 Structure of AMN–cubilin. a Cartoon representation of the crystal structure of ectoAMN (residues 20–357, blue) in complex with trimeric cubilin (residues 36–135, shades of green). Disulphide bonds are shown as yellow sticks. b Topology diagram of the cubilin trimer. c Topology diagram of AMN. d Close-up views of selected residues surrounding the potentially N-linked glycosylated AMN residues, Asn35 and Asn39. Hydrogen bonds and ionic interactions are marked by dashed black lines. Water molecules are shown as red spheres. e N-linked glycosylation modelled on Asn39. The black box represents the close-up view shown in d

AMN has ~3 kDa of Asn-linked oligosaccharides as demonstrated by PNGase F digestion of AMN purified from solubilised human kidney membranes13. Two consensus sequences (Asn-X-Ser/Thr) are present in AMN for potential attachment of N-linked oligosaccharides: site I for glycosylation of Asn35 and site II for glycosylation of Asn39. Both Asn35 and Asn39 are located in the apex of β-helix 1 (Fig. 1d). Asn35 is buried within β-helix 1 and engages in hydrogen bonding with Asn27 and Arg109. As Asn35 is not solvent exposed, attachment of an oligosaccharide would require major rearrangements of AMN β-helix 1. Asn39, on the other hand, is solvent exposed and points towards the SEA domain. A mature N-linked glycosylation can be modelled on Asn39 without any clashes or rearrangements (Fig. 1e).
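The consensus motif described above can be located with a simple pattern scan. The following is a minimal sketch, not the procedure used in this study: the peptide `toy` is a hypothetical fragment (not the real AMN sequence) numbered so that its two Asn residues fall at positions 35 and 39, the helper names are illustrative, and the exclusion of Pro at the X position is the commonly used refinement of the motif rather than something stated in the text.

```python
import re

# Asn-X-Ser/Thr sequon. Excluding Pro at X is the usual refinement of the
# motif and is an assumption added here; the text only states Asn-X-Ser/Thr.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence, first_residue_number=1):
    """Return the residue numbers of Asn residues that start a sequon."""
    return [m.start() + first_residue_number for m in SEQUON.finditer(sequence)]


def substitute(sequence, residue_number, new_residue, first_residue_number=1):
    """Return a copy of the sequence with one residue replaced."""
    i = residue_number - first_residue_number
    return sequence[:i] + new_residue + sequence[i + 1:]


# Hypothetical peptide, NOT the AMN sequence: residue 33 is the first letter,
# so the Asn residues sit at positions 35 and 39.
toy = "GANLSANGTQ"
FIRST = 33

if __name__ == "__main__":
    print(find_sequons(toy, FIRST))
    # Replacing the Ser/Thr at the +2 position removes that sequon only.
    print(find_sequons(substitute(toy, 41, "I", FIRST), FIRST))
    print(find_sequons(substitute(toy, 37, "A", FIRST), FIRST))
```

A scan of this kind only reports potential sites. Whether a site is actually used depends on its structural context, as the contrast between the buried Asn35 and the exposed Asn39 illustrates.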

Structure of cubilin

Three cubilin (residues 26–135) chains join to form a pin-shaped molecule with an approximate diameter of 30 Å and a length of 60 Å (Fig. 1a). The head is formed by a three-turn intertwined β-helix, which irreversibly interlocks the three cubilin chains. The β-helix is extended with a single β-hairpin at the N-terminus of each chain. In the C-terminal region, the three cubilin chains form a coiled-coil, which is interrupted midway owing to the presence of a proline residue (Pro103). This hinge region introduces a certain degree of flexibility in the coiled-coil region, as we observe a dramatic increase of B-factors in the region following the hinge (Supplementary Figure 1). A potential glycosylation site on Asn105 is also located in the hinge region and oligosaccharides can be modelled on Asn105 with minor rearrangements (Supplementary Figure 2). No interpretable electron density is observed for cubilin residues 121–135, which are predicted to extend the coiled-coil region leading up to the first EGF-like domain. These residues will extend the length of the cubilin N-terminal region by 15 Å, giving it a total length of 75 Å.

AMN–cubilin interface

The cubilin interface with AMN is formed by the N-terminal strands of three cubilin chains (residues 42–49), which combine into a triangular face with a hydrophobic centre formed by Met44 (Fig. 2a). This structure is complementary to the triangular base of AMN β-helix 1. Here the hydrophobic centre is formed by AMN residues Met69, Leu71, Leu77, Leu79 and Phe85 (Fig. 2b). The two hydrophobic faces form extensive van der Waals interactions, shielding the hydrophobic areas from the solvent. In addition, the β-strands at the triangle perimeter form anti-parallel β-sheets via main-chain hydrogen bonds (Fig. 2c–e). Only a few interactions are formed between the side chains of cubilin and AMN. Although the structures of the three cubilin chains that constitute the AMN interface are more or less identical and display three-fold symmetry, their interactions with AMN are markedly different (Fig. 2c–e). This is most likely caused by the asymmetric nature of the AMN β-helix 1. Alignment of AMN and cubilin sequences from distantly related organisms reveals that the hydrophobic residues forming the cores of the β-helical domains of both AMN and cubilin are highly conserved (Fig. 3a–d). This suggests that the cubam receptor architecture, with three cubilin chains anchored to AMN via β-helix–β-helix association, is conserved among species.

Fig. 2 Interaction between AMN and cubilin. a AMN interface of the trimeric cubilin β-helix. The hydrophobic centre of the β-helix is formed by Met44 residues. b Cubilin interface of AMN. AMN in b is related to cubilin in a by a vertical rotation of 180°. c–e Interactions between the three individual polypeptide chains of cubilin with AMN. Residues directly involved in interactions are shown as sticks. Hydrogen bonds and ionic interactions are marked by dashed black lines

Fig. 3 Sequence alignment of AMN and cubilin β-helices. a Sequence alignment of AMN β-helix 1. b Sequence alignment of the cubilin β-helix. Secondary structure elements are shown above the alignment. Fully conserved residues are marked by dark grey boxes. Hydrophobic residues (Ala, Val, Leu, Ile, Met, Phe) are marked by light grey boxes. Residues constituting the core of the β-helices are marked by asterisks below the alignment. c and d Cartoon representation of AMN β-helix 1 (c) and the cubilin β-helix (d). Hydrophobic residues forming the core of the β-helices are shown as grey sticks
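The shading scheme used in Fig. 3 can be illustrated with a short sketch. This is an assumption-laden illustration rather than the tool used to prepare the figure: the function name `mark_alignment`, the one-letter mark codes and the toy alignment are hypothetical, and giving the dark (fully conserved) mark precedence over the light (hydrophobic) mark is a choice made here.

```python
# Hydrophobic residue set as listed in the Fig. 3 legend.
HYDROPHOBIC = set("AVLIMF")


def mark_alignment(aligned):
    """Mark each residue of a gapped alignment.

    'D' = column is fully conserved (dark grey in the figure),
    'L' = residue is hydrophobic (light grey),
    '.' = neither. Precedence of 'D' over 'L' is an assumption.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # A column counts as conserved when every sequence has the same residue
    # and that residue is not a gap character.
    conserved = [len(set(col)) == 1 and col[0] != "-" for col in zip(*aligned)]
    return [
        "".join(
            "D" if conserved[i] else "L" if res in HYDROPHOBIC else "."
            for i, res in enumerate(seq)
        )
        for seq in aligned
    ]


if __name__ == "__main__":
    # Hypothetical three-sequence alignment, not taken from AMN or cubilin.
    toy_alignment = ["MLKV", "MIKA", "MLRG"]
    for seq, marks in zip(toy_alignment, mark_alignment(toy_alignment)):
        print(seq, marks)
```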

Electron microscopy (EM) on full-length cubam

To visualise the entire cubam receptor, we purified full-length cubam from solubilised porcine kidney membranes using immobilised human IF-B12 (ref. 12) and performed negative-stain EM. In the presence of Ca2+, the electron micrographs reveal 700–800 Å long tree-shaped structures with an ~400 Å long stem and a globular crown region (Fig. 4a, c). Two distinct populations of receptors are observed (marked by arrows in Fig. 4a): a population with a single stem and a small crown, and a population with two more or less parallel stems and a larger crown, suggesting dimerisation of the receptor complex. When ethylenediaminetetraacetic acid (EDTA) is added, the electron micrographs show a markedly different morphology (Fig. 4b). These micrographs resemble previously published electron micrographs of cubilin11 (probably in complex with AMN) and show partially unfolded structures with the three cubilin chains as individual lobes connected via interaction with AMN. These results suggest that Ca2+-binding is important for the structural integrity of the receptor by stabilising individual domains (CUB and EGF-like) and/or by mediating interactions between cubilin chains.

Fig. 4 Negative-stain electron microscopy and modelling of full-length cubam. a Typical electron micrograph showing two different forms of cubam (marked by arrows): a single-stem form with a small head domain and a double-stem form with a larger head domain. A 100 nm scale bar is shown in the lower left-hand corner. A class average of 758 particles of the double-stem form is shown in the lower right-hand corner. Meaningful class averages of the single-stem form of cubam could not be obtained due to inhomogeneous particles. b Typical electron micrograph showing partly unfolded cubam in the presence of 10 mM EDTA. Ca2+-binding motifs are found in several of the CUB and EGF-like domains of cubilin. Receptor unfolding in the presence of EDTA suggests that Ca2+-binding serves to maintain the structural integrity of cubam. c Schematic representation of the single-stem form of cubam with approximate dimensions of the "stem" and "crown" regions. d Schematic representation of the double-stem form of cubam. The top-left box shows a schematic representation of the cubam location in the enterocyte brush-border membrane with approximate proportional sizes of cubam and the microvilli. The predicted IF-B12-binding region of cubam is marked by a box and the structure of CUB5–8–IF-B12 (ref. 12) is shown as an enlargement with the B12 structure visualised in the original purple colour. We anticipate that the stem of cubam is formed by AMN and the N-terminal region of cubilin including the eight EGF-like domains. The stem of cubam is marked by a box and a model of the stem is shown as an enlargement

The observed stem of cubam is likely formed by AMN and the cubilin N-terminal region including the eight EGF-like domains. EGF-like domains are small compact domains composed of around 40 residues and are often present as consecutive repeats20. A consensus Ca2+-binding motif is frequently found in the interface between adjacent EGF-like domains and may serve to stabilise the conformation of two consecutive domains21. In cubilin, consensus Ca2+-binding motifs are present in the interfaces between EGF-like domains 1–2, 3–4, 4–5 and 7–8. Several structures have shown that consecutive EGF-like domains form rod-like structures with an almost linear arrangement22,23,24,25. Based on the structure of human Notch-1, we have modelled the eight EGF-like domains of cubilin and attached them to the structure of AMN–cubilin (Fig. 4d). The resulting model has a length of ~400 Å, which closely matches the length of the stem observed in the electron micrographs. The physiological role of the stem might well be to place the ligand-binding regions further from the membrane, which may be advantageous for catching multiple ligands in the fluid passing the apical cells.

The crown of cubam in the electron micrographs is most likely constituted by the 27 CUB domains from each cubilin subunit. From the micrographs it appears that the CUB domains are organised in a more or less ordered structure that likely exposes the ligand-binding domains to the surrounding fluids, while other domains are packed in the interior.

IGS mutations

So far, 69 IGS-causing mutations have been described (Human Gene Mutation Database version 2017.1; ref. 26). Nonsense mutations that introduce premature stop codons or mutations that affect splicing events readily explain the absence of cubam expression27. However, except for those mutations in cubilin that directly affect the IF-B12-binding region12,28, less is known about how missense mutations that lead only to a single amino acid substitution cause disease (Supplementary Table 2).

A single amino acid substitution that causes IGS is the AMN T41I missense mutation29. Expression of the AMN T41I mutant and cubilin in E. coli yields a stable complex (Supplementary Figure 5). Hence, receptor malfunction caused by T41I is not clearly explained by the structural changes alone. As described above, AMN contains two consensus sequences for potential N-linked glycosylation. The T41I mutation alters site II and consequently inhibits the transfer of oligosaccharides by oligosaccharyltransferases in the ER to Asn39 (ref. 30) (Fig. 5a). To investigate the functional significance of the two potential glycosylation sites, we performed flow cytometry and immunoprecipitation experiments on various AMN site I and site II mutants co-transfected with cubilin in CHO cells (Fig. 5b, c).

Fig. 5 N-linked glycosylation of AMN. a Consensus sequence motif for potential N-linked glycosylation of AMN. The two consensus sites are marked by lines above the sequence. Potentially glycosylated Asn residues are marked by asterisks. Residues selected for site-directed mutagenesis are marked in bold. b Surface expression of cubilin in transiently transfected CHO K1 cells. Cells were co-transfected with cubilin and AMN wild type or AMN mutants. Live cells were gated in SSC-A and FSC-A and replotted in a contour plot showing cubilin expression vs. FSC-A. The gate for cubilin-expressing cells (cubilin+) was set based on non-cubilin-expressing control cells (mock transfected). The full gating strategy is shown in Supplementary Figure 3A. Relative surface expression of cubilin is shown in the bottom right corner. As cubilin and AMN expression was established by transient transfection, only 10–20% of analysed cells co-expressed both proteins (Supplementary Figure 3B). The relative surface expression of cubilin was therefore calculated as % of cubilin+ cells (surface expression)/% cubilin+, AMN+ cells (total cell stain). The experiment was performed in triplicate and error bars represent the standard error of the mean. c Immunoprecipitation of cubilin with wild type or mutant forms of AMN-V5. The top blot was visualized using rabbit polyclonal anti-rat cubilin antibody followed by horseradish peroxidase-conjugated goat polyclonal anti-rabbit IgG. The bottom blot was visualized using mouse monoclonal anti-V5 alkaline phosphatase (AP)-conjugated antibody. Uncropped blots are shown in Supplementary Figure 4
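The ratio described in the legend of Fig. 5b can be written out as a short calculation. The sketch below is illustrative only: the function names and the triplicate percentages are hypothetical and are not data from this study, and the standard error is computed from the sample standard deviation. No normalisation to wild type is applied, since the legend does not specify one.

```python
from math import sqrt
from statistics import mean, stdev


def relative_surface_expression(pct_surface_cubilin_pos, pct_total_double_pos):
    """% cubilin+ cells (surface stain) / % cubilin+, AMN+ cells (total stain)."""
    if pct_total_double_pos <= 0:
        raise ValueError("the co-expressing fraction must be positive")
    return pct_surface_cubilin_pos / pct_total_double_pos


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    return mean(values), stdev(values) / sqrt(n)


if __name__ == "__main__":
    # Hypothetical triplicate: (surface % cubilin+, total % cubilin+ AMN+).
    replicates = [(12.0, 15.0), (9.0, 15.0), (12.0, 20.0)]
    ratios = [relative_surface_expression(s, t) for s, t in replicates]
    m, sem = mean_and_sem(ratios)
    print(f"relative surface expression = {m:.2f} +/- {sem:.2f} (SEM, n={len(ratios)})")
```

Dividing by the co-expressing fraction corrects for the low and variable efficiency of transient co-transfection, so that mutants are compared on the basis of cells that actually express both proteins.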

The AMN N35Q and S37A mutations both disrupt N-glycosylation site I; however, their effects on cubam surface expression are markedly different. Whereas N35Q completely abolishes surface expression, the AMN S37A mutant behaves as wild type. This indicates that Asn35 is not glycosylated but is instead important for intra-molecular interactions, as also suggested by its position in the structure of AMN (Fig. 1d). The AMN T41I mutation that disrupts site II causes a significant reduction of surface expression (Fig. 5b), explaining why this mutation causes IGS. Interestingly, when both N-glycosylation sites I and II are inhibited using the S37A/T41I double mutant, surface expression is restored (Fig. 5b). This suggests an interplay between the two adjacent glycosylation sites, which is supported by the migration of AMN in SDS–PAGE (Fig. 5c). Here, neither of the individual S37A and T41I mutations causes a reduction in the apparent molecular size, whereas the S37A/T41I double mutant migrates similarly to PNGase F-treated AMN. Altogether, this indicates that the T41I mutation causes an aberrant glycosylation pattern of AMN, ultimately leading to reduced surface expression of cubam.

The AMN IGS mutations L59P (ref. 27), M69K (ref. 31), C234F (ref. 32) and G254E (ref. 27) are all retained in the ER when co-expressed with cubilin in HEK293 cells16, which explains why these mutations impair cubam receptor function and cause IGS. Introducing the individual mutations in our E. coli expression system does not yield any soluble AMN or AMN–cubilin complex (Supplementary Figure 5). Since the folding of AMN and its interaction with cubilin are mutually dependent, we cannot decipher the cause of receptor malfunction from these experiments. However, we can predict the consequences of the individual mutations from the structure of AMN.

The AMN mutations L59P and M69K are both located in the β-helix 1 domain responsible for the interaction with cubilin (Fig. 6a). Leu59 is located in the β-helical core of the domain, and mutation to proline alters the main-chain conformation of the residue and introduces clashes with Ser149 (Fig. 6b). This probably leads to a general destabilisation of the entire domain and disruption of the AMN–cubilin interface. Met69 is positioned directly in the AMN–cubilin interface, where it engages in hydrophobic interactions (Fig. 6c). Mutation to a lysine residue in this position introduces an unfavourable positive charge in the hydrophobic interface that most likely prevents the AMN–cubilin association.

Fig. 6 Missense mutations of AMN causing Imerslund–Gräsbeck syndrome. a AMN missense mutations causing Imerslund–Gräsbeck syndrome are marked by green spheres on a cartoon representation of the ectoAMN–cubilin structure. b–d Close-up views of AMN Leu59, Met69 and Gly254. The mutations are modelled using the rotamers with least clashes and shown as green sticks. Hydrogen bonds and ionic interactions are marked by dashed black lines

The two AMN missense mutations C234F and G254E are located in regions further from the cubilin-binding site (Fig. 6a). The C234F mutation disrupts a disulphide bond in the cysteine-rich region and leaves a free cysteine residue, which can engage in non-specific disulphide bonds and cause misfolding of AMN. The G254E mutation is located in the SEA domain. Substitution of Gly254 with a glutamate residue will introduce clashes and position the charged glutamate residue in a hydrophobic environment (Fig. 6d), which will probably destabilise the entire domain and result in misfolded AMN.