The extent of prokaryotic LGT to eukaryotes is the question that Ku and Martin tackle in their recent article in BMC Biology [8]. Their reasoning is that if LGT from prokaryotes to eukaryotes is continuous and prevalent, traces of recent LGT must be detectable in eukaryote genomes. To assess this, they re-analyzed their 2015 dataset of ~2600 phylogenetic trees encompassing 55 eukaryotes from diverse lineages and ~2000 prokaryote species. While they identified many prokaryote-to-prokaryote LGT candidates with high similarity between donor and receiver genes, indicative of recent transfer, they found a paucity of highly similar prokaryote-to-eukaryote LGT candidates. Moreover, whereas recent LGT candidates in prokaryotes are present in multiple species of the receiver clade, this is much more rarely observed in eukaryotes. Furthermore, if the candidate eukaryotic acquisitions from plastid and mitochondrial ancestors are excluded from the analysis, only a few species-specific recent candidate LGT events remain. Because these few recent candidates are specific to one or a few species and highly similar to their candidate prokaryotic donors, they cannot easily be distinguished from bacterial contamination. On the basis of these observations, the authors conclude that there is a lack of evidence for recent LGT of prokaryotic origin in eukaryotic genomes and that this phenomenon is neither continuous nor prevalent. They further propose that any protein-coding gene in a eukaryotic genome with ≥70 % identity to prokaryotic homologs should first be considered likely contamination rather than a candidate LGT.
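
As a minimal sketch of the triage rule proposed above, the following Python snippet flags genes whose best prokaryotic hit reaches 70 % identity. The function name, the input format (a mapping from gene identifier to the percent identity of its best prokaryotic hit), and the labels are illustrative assumptions and are not taken from Ku and Martin's analysis.

```python
# Hypothetical triage helper; names and input format are assumptions.
CONTAMINATION_IDENTITY_THRESHOLD = 70.0  # percent identity, as proposed in the text


def triage_by_identity(best_prokaryote_identity,
                       threshold=CONTAMINATION_IDENTITY_THRESHOLD):
    """Label each gene according to the identity of its best prokaryotic hit.

    Genes at or above the threshold are flagged so that contamination is
    ruled out before they are treated as LGT candidates.
    """
    labels = {}
    for gene_id, identity in best_prokaryote_identity.items():
        if identity >= threshold:
            labels[gene_id] = "check_contamination_first"
        else:
            labels[gene_id] = "not_flagged"
    return labels


# Example with made-up identities for three hypothetical genes.
example = {"gene_a": 85.0, "gene_b": 70.0, "gene_c": 42.5}
print(triage_by_identity(example))
```

A gene that is not flagged is not thereby an LGT candidate; the threshold only decides which hypothesis to examine first.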

Several confounding factors could, of course, also contribute to this paucity of candidate recent LGT in eukaryote genomes. First, the authors' dataset includes almost 40 times more bacterial species (including closely related species or different strains of the same species) than eukaryotic species (none of which are closely related). This imbalance may partly explain the paucity of recent candidate LGT conserved between multiple eukaryote species within a receiver clade. One could also argue that the true prokaryotic LGT donors have not been sampled, because most are probably uncultured bacteria distant from anything that has been sequenced. Finally, the removal of sequences highly similar to bacterial genes prior to eukaryotic genome annotation (or assembly) could also contribute to this deficiency of putative recent LGT. In many genome projects these highly similar sequences are considered contaminants and are absent from the final set of predicted protein-coding genes. However, as stated by the authors, these factors probably account for only a minor part of the huge difference between the prokaryote–prokaryote and prokaryote–eukaryote distributions of similarity between candidate donor and receiver genes. It is almost certain that the contribution of LGT of prokaryote origin to the making of a eukaryotic nuclear genome is several orders of magnitude less important than it is for prokaryotes.

What we can conclude from this recent paper and the tardigrade controversy is that any claim of prokaryote–eukaryote LGT (particularly one involving high identity to candidate prokaryotic donors) must be treated with caution and, ideally, backed by additional supporting evidence. In addition to phylogenetic analysis, features such as the presence of bona fide eukaryotic genes on the same contigs as the candidate LGT sequences, the presence of spliceosomal introns, conservation of the LGT candidate in sister species, and transcriptional support all provide additional evidence for LGT rather than contamination.
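
The lines of evidence listed above can be recorded per candidate as a simple checklist. The sketch below is only an illustration: the class and field names are hypothetical, and no scoring rule or cutoff is implied, since the text does not specify one.

```python
from dataclasses import dataclass


@dataclass
class LgtCandidateEvidence:
    """Hypothetical record of the evidence gathered for one LGT candidate."""
    phylogeny_supports_prokaryotic_origin: bool
    on_contig_with_eukaryotic_genes: bool
    has_spliceosomal_introns: bool
    conserved_in_sister_species: bool
    has_transcript_support: bool


def supporting_evidence(candidate):
    """Return the names of the evidence lines satisfied by this candidate."""
    return [name for name, present in vars(candidate).items() if present]


# Example with made-up values for a single hypothetical candidate.
candidate = LgtCandidateEvidence(
    phylogeny_supports_prokaryotic_origin=True,
    on_contig_with_eukaryotic_genes=True,
    has_spliceosomal_introns=False,
    conserved_in_sister_species=False,
    has_transcript_support=True,
)
print(supporting_evidence(candidate))
```

Listing which lines of evidence hold, rather than collapsing them into a single score, keeps the judgment about contamination versus LGT with the curator.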