Partner fidelity of the two UCYN-A lineages

Microscopic in situ identification of the different UCYN-A lineages and their prymnesiophyte partners by specific CARD-FISH staining is crucial for determining the specificity of their relationships. The CARD-FISH method has been successfully applied to identify unicellular diazotrophic cyanobacteria16 and to target the UCYN-A clade specifically15,17. However, to our knowledge, no probe had been reported that distinguishes UCYN-A at the lineage level. We designed a competitor probe to be used with the UCYN-A732 probe15 to distinguish the UCYN-A1 and UCYN-A2 lineages (Fig. 1a–c; Supplementary Table 1). Similarly, we designed two probes to distinguish the two prymnesiophyte partners, B. bigelowii (UBRADO69 probe) and the closely related prymnesiophyte (UPRYM69 probe) (Fig. 1a–c; Supplementary Table 1). The UCYN-A732 probe, in the absence of its competitor, labelled UCYN-A cells inside either B. bigelowii or the closely related prymnesiophyte partner (Fig. 1a,c). However, when the UBRADO69 probe was applied together with the UCYN-A732 probe and its competitor, UCYN-A cells remained unlabelled when accompanying B. bigelowii and were labelled when accompanying the closely related prymnesiophyte partner (Fig. 1b). It has been proposed that smaller UCYN-A cells are associated with smaller prymnesiophyte cells and vice versa, indicating different growth stages17. However, those findings were interpreted from microscopic observations of the UCYN-A symbiosis detected with the general prymnesiophyte PRYM02 probe and the UCYN-A732 probe (without its competitor), that is, without the ability to distinguish UCYN-A1 from UCYN-A2 cells. The results presented here show that the two prymnesiophyte partners are phylogenetically closely related but distinct species, and we therefore suggest that the observed differences in cell size between prymnesiophyte partners reflect distinct species rather than different growth stages of the same species. 
These results demonstrate that UCYN-A lineages display partner fidelity with their prymnesiophyte partners: B. bigelowii associates specifically with the UCYN-A2 lineage, and the closely related prymnesiophyte with the UCYN-A1 lineage.

Figure 1: Partner specificity and variation of UCYN-A lineages with plankton size fraction. (a–c) Epifluorescence microscopy images with the double-CARD-FISH assay showing the specificity of symbiont–host pairs and (d–f) fragment recruitment of UCYN-A lineages in size-fractionated metagenomes from surface waters collected in station TARA_078. (a–c) Left panels correspond to the 4′,6-diamidino-2-phenylindole signal (blue-labelled DNA); right panels correspond to the combined signal of the prymnesiophyte-specific probes (green-labelled host under blue light excitation) and the UCYN-A probe (red-labelled symbiont under green-light excitation). (a) UCYN-A1 with its prymnesiophyte partner; (b) the two UCYN-A symbiotic pairs, indicating the specific labelling of UCYN-A1 (upper) and B. bigelowii (lower) with their specific partners, the small prymnesiophyte closely related to B. bigelowii and UCYN-A2 respectively; (c) B. bigelowii with UCYN-A2. The inset in c shows the detail of non-associated UCYN-A2 cells within a common symbiotic structure. Prymnesiophyte partners are indicated by arrow heads. Scale bar in a represents 5 μm and this scale is shared in a–c except in the inset of c where it indicates 2 μm. (d–f) On the left side, recruitment of metagenomic reads using UCYN-A1 and UCYN-A2 genomes as reference. Reads are plotted as red (UCYN-A1) or blue (UCYN-A2) dots depending on the closest hit genome, representing the covered genome positions (x axis) and the % of identity with the closest reference (y axis). A horizontal grey line set at 95% indicates the threshold for reads representing members of the same population as the reference genome. On the right side, histograms represent the number of recruited reads, in logarithmic scale, by UCYN-A1 (red) or UCYN-A2 (blue) genomes in intervals of 1% identity, from 100 to 70% identity.

The number of UCYN-A cells per partner is lineage specific

Previous studies have shown that the prymnesiophyte partners can harbour one or two UCYN-A cells4,9,13,15, pointing to a coupling between the prymnesiophyte cell division and the number of symbiotic cells, at least for UCYN-A1 (ref. 9). In our samples, only one UCYN-A1 cell per prymnesiophyte cell was detected (Fig. 1a,b). By contrast, B. bigelowii carried a symbiosome-like compartment with a variable but higher number of UCYN-A2 cells (∼3–10 cells) (Fig. 1b,c). This structure was observed both attached to the host and in a free state, as an entity composed of several UCYN-A2 cells enclosed by a common envelope (Fig. 1c). In a previous study, the UCYN-A2 cells found in B. bigelowii were separated from the B. bigelowii cytoplasm by a single membrane, likely a perisymbiont membrane, and the envelope of the UCYN-A2 cell itself consisted of three layers, possibly an outer membrane, a peptidoglycan wall and a plasma membrane13. Although UCYN-A1 and UCYN-A2 are very similar in terms of gene content, the genes involved in cell wall biogenesis and cell shape determination appear to be present only in UCYN-A2, suggesting clear structural differences associated with its host12. Therefore, our observations hint at different symbiotic organizations: whereas the small prymnesiophyte partner carries one or two separate UCYN-A1 cells, B. bigelowii may harbour up to 10 UCYN-A2 cells within a common symbiotic structure.

UCYN-A lineages vary in different plankton size fractions

A total of eight marine metagenomes from stations TARA_078 and TARA_076 were analysed to assess the distribution of UCYN-A lineages in several plankton size fractions (0.2–3, 0.8–5, 5–20 and >0.8 μm) of the microbial assemblages in surface and DCM waters (Table 1). We used the two UCYN-A genomes sequenced to date as reference genomes11,12 in the fragment recruitment of these metagenomic samples (Table 1). Given the UCYN-A partner fidelity displayed by double CARD-FISH (see above), metagenomic sequence reads from UCYN-A lineages should vary with size fraction as predicted by the different cell sizes of the prymnesiophyte partners. The sequence reads from the UCYN-A1 lineage were primarily present in surface waters within the size fraction range of the small prymnesiophyte partner (0.2–3, 0.8–5 and >0.8 μm; Table 1). Almost 100% of the UCYN-A1 genome was recovered from each surface metagenome of these size fractions at both stations. By contrast, UCYN-A1 sequence reads were poorly represented in the 5–20 μm size fraction (∼10% of genome recovery; Fig. 1d–f; Table 1). On the other hand, in TARA_078, the UCYN-A2 sequence read distribution in surface waters was consistent with the B. bigelowii cell size, that is, UCYN-A2 reads were nearly absent in the 0.2–3 μm size fraction metagenomes, but were more abundant in the 0.8–5, 5–20 and >0.8 μm fractions. In all these larger fractions, UCYN-A2 reached high genome recovery values (90%, 76% and 99%, respectively), except for the >0.8 μm fraction in TARA_076, where UCYN-A2 was virtually absent (Fig. 1d–f, Table 1). In the >0.8 μm size fraction, UCYN-A1 was approximately nine times more abundant than UCYN-A2 in TARA_078 (Table 1). Likewise, in the same station, the small prymnesiophyte partner was more abundant than B. bigelowii based on V9 18S ribosomal RNA (rRNA) tags9. 
In the DCM samples, both UCYN-A lineages were poorly represented in the metagenome sequences, accounting for <14% and 1% of genome recovery for UCYN-A1 and UCYN-A2, respectively (Table 1). The same vertical distribution has been observed for their prymnesiophyte partners that were found preferentially in surface layers, while the rest of the prymnesiophyte assemblage peaked at the DCM9. Therefore, although the UCYN-A1 lineage was in general more abundant than UCYN-A2, a transition from the UCYN-A1 to UCYN-A2 lineage was observed from smaller to larger size fractions, likely explained by the partner fidelity and the difference in cell size of their prymnesiophyte partners.
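The fragment recruitment logic described above (each read assigned to its closest reference genome, a 95% identity threshold, genome recovery as the percentage of reference positions covered, and read counts in 1% identity bins) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the hit tuple layout, the function names, the best-hit rule (highest identity wins), the 1-based inclusive coordinates and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict

IDENTITY_THRESHOLD = 95.0  # % identity line drawn in Fig. 1d-f


def assign_reads(hits):
    """Keep, for each read, the hit with the highest identity (assumed rule).

    hits: iterable of (read_id, genome, start, end, identity) tuples,
    with 1-based inclusive coordinates on the reference genome.
    """
    best = {}
    for read_id, genome, start, end, identity in hits:
        if read_id not in best or identity > best[read_id][3]:
            best[read_id] = (genome, start, end, identity)
    return best


def genome_recovery(best_hits, genome_lengths, threshold=IDENTITY_THRESHOLD):
    """Percentage of each reference genome covered by reads >= threshold."""
    intervals = defaultdict(list)
    for genome, start, end, identity in best_hits.values():
        if identity >= threshold:
            intervals[genome].append((start, end))
    recovery = {}
    for genome, length in genome_lengths.items():
        covered = 0
        current_end = 0  # right-most position already counted
        for start, end in sorted(intervals[genome]):
            start = max(start, current_end + 1)  # skip overlap with earlier reads
            if end >= start:
                covered += end - start + 1
                current_end = end
        recovery[genome] = 100.0 * covered / length
    return recovery


def identity_histogram(best_hits, low=70, high=100):
    """Count recruited reads per genome in 1% identity bins between low and high."""
    counts = {}
    for genome, _start, _end, identity in best_hits.values():
        if low <= identity <= high:
            bin_floor = min(int(identity), high - 1)  # 100% goes into the 99-100 bin
            per_genome = counts.setdefault(genome, {})
            per_genome[bin_floor] = per_genome.get(bin_floor, 0) + 1
    return counts


# Hypothetical toy data, not taken from Table 1.
toy_hits = [
    ("r1", "A1", 1, 50, 99.5),
    ("r1", "A2", 1, 50, 90.0),   # weaker second hit of the same read
    ("r2", "A1", 40, 80, 99.0),
    ("r3", "A2", 101, 150, 98.0),
    ("r4", "A2", 1, 100, 85.0),  # recruited, but below the 95% line
]
toy_best = assign_reads(toy_hits)
print(genome_recovery(toy_best, {"A1": 100, "A2": 200}))
print(identity_histogram(toy_best))
```

In this sketch, reads below the 95% line still appear in the identity histogram but do not contribute to genome recovery, mirroring the distinction between the scatter plots and the grey threshold line in Fig. 1d–f.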

Table 1 Fragment recruitment (FR) of UCYN-A lineages.

Another interesting finding was that most of the metagenomic (and metatranscriptomic) reads mapping to the UCYN-A1 or UCYN-A2 genomes had very high sequence identities (>99% to their respective reference genome; Fig. 1d–f), which suggests an extremely low microdiversity within populations that were sampled from geographically distant regions in the Pacific (ALOHA and SIO) and South Atlantic Oceans (this study). The size-fractionated sampling strategy combined with the metagenomic analyses reported in this study will also be important to uncover the genomic pool of new UCYN-A lineages, such as UCYN-A3, to identify the lineage-specific distribution of UCYN-A populations and to set the cell size range of their partners, a first step towards their identification.

UCYN-A expression is streamlined to fuel nitrogen fixation

The analysis of seven size-fractionated metatranscriptomes from two stations (TARA_078 and TARA_076) and depths (surface and DCM) allowed, for the first time, whole-genome transcription profiling of these widely distributed diazotrophic cyanobacteria (Table 1). In surface waters, UCYN-A1 transcripts were in general more abundant than those from UCYN-A2, except in the 5–20 μm size fraction (TARA_078), in which the latter were dominant (Table 1). The expression of 1,131 and 1,179 protein-coding genes in UCYN-A1 (Supplementary Data 1) and UCYN-A2 (Supplementary Data 2), respectively, was examined. In both lineages, the nitrogen fixation operon, including the nifH gene, was the most highly expressed gene cluster, accounting for a quarter of the total transcripts (Fig. 2a,b). In the >0.8 μm size fraction (TARA_078), despite UCYN-A1 being more abundant than UCYN-A2, the expressed nifH transcripts per cell were about 1.6 times higher for UCYN-A2 (648.33) than for UCYN-A1 (396.60; Supplementary Data 1 and 2). It is well known that biological nitrogen fixation has a high energetic cost (16 mol of ATP to generate 2 mol of ammonia). Notably, the F0F1-ATP synthase operon and the genes encoding the cytochrome b6f complex and the photosystem I complex (PSI) were highly transcribed and positively correlated (P<10⁻⁵, N=6, linear regression analysis) with the nitrogen fixation operon transcript abundances (Fig. 2c). These findings suggest that the generation of reducing power and ATP synthesis could be coupled to fuel the nitrogen fixation process in UCYN-A. Likewise, UCYN-A2 might have higher nitrogen fixation rates per cell than UCYN-A1 based on the higher number of nifH transcripts per cell. It is reasonable to assume that the differences in nifH gene expression between the UCYN-A lineages could simply reflect the differences in the cell size of their partners, which have different nutrient requirements for growth. 
In addition, it has been indirectly demonstrated that the nitrogen fixation of UCYN-A supports the CO2 fixation of its prymnesiophyte partner18. Therefore, we hypothesize that the larger B. bigelowii host cell would meet its larger N nutrient requirements by partnering with a larger number of UCYN-A2 symbiotic cells.
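For reference, the energetic cost quoted above (16 mol of ATP per 2 mol of ammonia) corresponds to the standard textbook stoichiometry of the nitrogenase reaction. The equation below is that general stoichiometry, not a measurement from this study:

```latex
\[
\mathrm{N_2} + 8\,\mathrm{H^+} + 8\,e^- + 16\,\mathrm{ATP}
\;\longrightarrow\;
2\,\mathrm{NH_3} + \mathrm{H_2} + 16\,\mathrm{ADP} + 16\,\mathrm{P_i}
\]
```

Each mole of ammonia therefore requires 8 mol of ATP and 4 mol of electrons, which is consistent with the co-expression of the ATP synthase, cytochrome b6f and PSI genes with the nitrogen fixation operon.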

Figure 2: Genome expression in UCYN-A1 and UCYN-A2 lineages. (a) Metatranscriptome recruitment at the surface of the TARA_078 station of UCYN-A1 (0.2–3 μm) and UCYN-A2 (5–20 μm) transcripts. Transcripts are plotted as black dots representing the covered genome positions and the % of identity with the closest reference. A horizontal grey line set at 95% identity shows the threshold used to count the number of times, or coverage, that a gene was expressed. The most expressed genes in both lineages are highlighted. (b) Relative contribution of the nitrogen fixation operon, F0F1-ATP synthase operon, cytochrome b6f and PSI genes to the total UCYN-A transcripts in surface samples; percentages are indicated. (c) Transcript counts of the nitrogen fixation operon versus those of ATP synthase (triangle), cytochrome b6f (square) and PSI (open circle) transcripts. All of these transcripts were significantly correlated (P<10⁻⁵), and regression lines, regression equations and R² values are indicated in the figure.

Nitrogen-fixing microorganisms, and particularly cyanobacteria, must protect their nitrogenase from inactivation by oxygen. The inability to use the O2-evolving photosystem II explains why UCYN-A appears to fix N2 and express the nitrogenase genes during the day19. However, its association with an oxygen-evolving partner could leave the nitrogenase enzyme in UCYN-A not completely safe from oxygen. We observed that the sufB gene (cysteine desulfurase), involved in the assembly or repair of oxygen-labile iron–sulfur clusters under oxidative stress, was highly transcribed (Supplementary Data 1 and 2). It may be that UCYN-A requires a high expression level of sufB genes to repair the nitrogenase enzyme after inactivation by oxygen, suggesting a role similar to that of the peroxidase genes found in their genomes11,12. Our findings, which represent the first whole-genome expression profiling of environmental UCYN-A populations, reveal that UCYN-A lineages dedicate a large transcriptional investment to nitrogen fixation.

UCYN-A diverged during the late Cretaceous

Our findings on partner fidelity in UCYN-A point to the hypothesis of symbiont–host co-evolution14. To analyse the selection pressure and evolution of the protein-coding genes, we calculated the number of synonymous or silent (Ks) and non-synonymous (Ka, inducing amino-acid change) nucleotide substitutions20,21 for 887 protein-coding genes shared by the UCYN-A1 and UCYN-A2 genomes (Supplementary Data 3). The Ka/Ks ratio may offer important clues about the selection pressure: ratios <1 indicate purifying selection and ratios >1 point to positive selection22. We found that 873 out of the 887 protein-coding genes were under purifying selection (P<0.05, codon-based Z-test) (Supplementary Data 3). The 14 remaining genes also presented Ka/Ks<1 but were not statistically well supported (P>0.05). Purifying selection means that synonymous mutations are maintained, while non-synonymous mutations are continuously removed from the population. We did not detect signs of large-scale positive selection, that is, no apparent strong adaptation to novel niches in UCYN-A lineages, suggesting that the evolutionary forces for niche adaptation would act on the prymnesiophyte partners rather than on UCYN-A. Our results are consistent with the fact that UCYN-A2 lacks the same major pathways and proteins that are absent in UCYN-A1 (ref. 12), indicating that the symbionts were genetically adapted to their hosts before they were separated by speciation.
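The interpretation rule for Ka/Ks described above can be written as a short Python sketch. It only classifies precomputed values; it does not estimate Ka and Ks from alignments or perform the codon-based Z-test. The function name, the 0.05 cut-off argument and the example values are assumptions for illustration and are not taken from Supplementary Data 3.

```python
def classify_selection(ka, ks, p_value, alpha=0.05):
    """Interpret a gene's Ka/Ks ratio.

    Returns (ratio, direction, supported), where direction is
    'purifying' (Ka/Ks < 1), 'positive' (Ka/Ks > 1), 'neutral' (Ka/Ks == 1)
    or 'undefined' when Ks is zero, and supported is True when the
    test P value is below alpha.
    """
    if ks == 0:
        return None, "undefined", False
    ratio = ka / ks
    if ratio < 1:
        direction = "purifying"
    elif ratio > 1:
        direction = "positive"
    else:
        direction = "neutral"
    return ratio, direction, p_value < alpha


# Hypothetical genes: (Ka, Ks, P value from a selection test).
examples = {
    "gene_a": (0.02, 0.40, 0.001),  # Ka/Ks < 1, well supported
    "gene_b": (0.30, 0.35, 0.20),   # Ka/Ks < 1, not supported
}
for name, (ka, ks, p) in examples.items():
    print(name, classify_selection(ka, ks, p))
```

Under this rule, a gene such as the hypothetical gene_b would fall into the same category as the 14 genes reported above: Ka/Ks below 1 but without statistical support.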

The age of divergence for the UCYN-A1 and UCYN-A2 lineages was calculated by phylogenomic and Bayesian relaxed molecular clock analyses (Fig. 3; Supplementary Table 2). Our results indicate that the UCYN-A1 and UCYN-A2 lineages diverged around 91 Myr ago, that is, during the late Cretaceous. In agreement, B. bigelowii has a fossil record extending back to the late Cretaceous (ca. 100 Myr ago)23, reported from neritic and pelagic sediments, for example, in lower Paleogene sediments immediately above the K/Pg mass extinction level as well as in the Oligocene Diversity Minimum24,25. In the Jurassic, between 190 and 100 Myr ago, nutrient availability in the ocean was lower than at any point during the last 550 Myr26. It is therefore likely that the symbiotic relationship between the common ancestor of UCYN-A1 and UCYN-A2 and a Braarudosphaera-related species was established by the late Cretaceous to cope with extremely low-nutrient conditions and a generalized oligotrophy in marine surface waters, as has been recognized for other symbiotic systems such as the Acantharia–Phaeocystis symbiosis27. UCYN-A then underwent purifying selection, progressively reducing its genome to the point that it became an obligate symbiont. An analogous case is that of the two Rhopalodiaceae freshwater diatom species, Rhopalodia gibba and Epithemia turgida, which acquired N2-fixing endosymbionts28,29. Similar to the two UCYN-A partnerships described here, the phylogenies of these two diatom species and their intracellular symbionts were found to be congruent and, concordantly, a single symbiotic event has been proposed29. A similar scenario can probably be envisioned here for the two UCYN-A partnerships.

Figure 3: Time-calibrated cyanobacteria tree. The phylogeny shown was estimated based on 135 proteins from 57 taxa. Three calibration points (black circles) were used for the tree presented and were treated as soft bounds. The root of the tree was set with a maximum age of 2,700 Myr ago and a minimum age of 2,320 Myr ago. Divergence time for the ancestor of cyanobacteria UCYN-A1 and UCYN-A2 (highlighted with a grey box) is given with the corresponding values for the posterior 95% confidence intervals in Supplementary Table 2.

Taking into account that the number of symbiotic cells harboured by distinct prymnesiophyte partners is different and phylogenetically dependent, that is, the larger B. bigelowii can harbour a variable number (up to 10) of UCYN-A2 cells, while the small prymnesiophyte partner harbours only one or two UCYN-A1 cells, it is reasonable to think that a larger nutrient acquisition could be linked to a larger number of symbionts. Indeed, the whole-genome expression patterns suggested that the metabolic investment in UCYN-A1 and UCYN-A2 is mainly focused on the nitrogen fixation machinery. Our evolutionary analysis revealed that UCYN-A1 and UCYN-A2 were genetically adapted to their prymnesiophyte partners before UCYN-A speciation (purifying selection); by contrast, the prymnesiophyte partners seem to follow different ecological strategies9, suggesting a speciation process under positive selection. Our results suggest that the partner fidelity shown by UCYN-A lineages, together with the speciation in the common ancestor of B. bigelowii and its closely related prymnesiophyte, may have forced an allopatric speciation of UCYN-A1 and UCYN-A2 populations in the late Cretaceous. Comparative genome analysis of the two prymnesiophyte partners would clarify whether these two algal species underwent positive selection through evolution by adaptation to novel niches. nifH phylogenetic analysis has revealed novel UCYN-A lineages, such as UCYN-A3; these lineages and their prymnesiophyte (or non-prymnesiophyte) partners should help to clarify the evolutionary relationships of N2-fixing cyanobacterial symbionts and the extent of their ecological relevance to marine biogeochemical cycles.