In a phylogenetic network analysis of the first 160 complete SARS-CoV-2 genomes sequenced from human patients, an international team of scientists found three distinct variants of the virus, named A, B and C. The A and C types are found in significant proportions outside East Asia, that is, in Europeans and Americans. In contrast, the B type is the most common type in East Asia, and its ancestral genome appears not to have spread outside the region without first mutating into derived B types, which points to founder effects or to immunological or environmental resistance against this type outside East Asia.

“There are too many rapid mutations to neatly trace a SARS-CoV-2 family tree,” said lead author Dr. Peter Forster, a geneticist at the University of Cambridge.

“We used a mathematical network algorithm to visualize all the plausible trees simultaneously.”

“These techniques are mostly known for mapping the movements of prehistoric human populations through DNA. We think this is the first time they have been used to trace the infection routes of a coronavirus like SARS-CoV-2.”

Dr. Forster and colleagues used SARS-CoV-2 genomes sampled from across the world between December 24, 2019, and March 4, 2020.

They revealed three distinct variants of the virus, consisting of clusters of closely related lineages.

They found that type A, the variant most closely related to the coronavirus discovered in bats and regarded as the original human virus genome, was present in Wuhan but, surprisingly, was not the city’s predominant virus type.

Mutated versions of A were seen in Americans reported to have lived in Wuhan, and a large number of A-type viruses were found in patients from the US and Australia.

Wuhan’s major virus type, B, was prevalent in patients from across East Asia. However, the variant didn’t travel much beyond the region without further mutations — implying a founder event in Wuhan, or resistance against this type outside East Asia.

The C variant is the major European type, found in early patients from France, Italy, Sweden and England. It is absent from the study’s Chinese mainland sample, but seen in Singapore, Hong Kong and South Korea.

The new analysis also suggests that one of the earliest introductions of the virus into Italy came via the first documented German infection on January 27, and that another early Italian infection route was related to a Singapore cluster.

Importantly, the team’s genetic networking techniques accurately traced established infection routes: the mutations and viral lineages joined the dots between known cases.

As such, the scientists argue that these phylogenetic methods could be applied to the very latest coronavirus genome sequencing to help predict future global hot spots of disease transmission and surge.

“Phylogenetic network analysis has the potential to help identify undocumented COVID-19 infection sources, which can then be quarantined to contain further spread of the disease worldwide,” Dr. Forster said.

Variant A, the one most closely related to the virus found in both bats and pangolins, is described by the researchers as the root of the outbreak.

Type B is derived from A and separated from it by two mutations, and C is in turn a daughter of B.
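To make the idea of variants being "separated by mutations" concrete, here is a minimal Python sketch that counts the positions at which aligned sequences differ. The three short sequences are invented for illustration and are not real SARS-CoV-2 data. The single difference between the toy B and C is also an assumption, since only the A-to-B distance of two mutations is stated here. The function name `count_differences` is hypothetical, and this is not the network algorithm the team used.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


# Toy sequences, invented for illustration only; NOT real viral genomes.
toy_genomes = {
    "A": "ACGTACGTAC",
    "B": "ACGCACGTAT",  # differs from A at two positions
    "C": "ACGCATGTAT",  # differs from B at one position (assumed)
}

# Pairwise mutation counts between every pair of toy variants.
pairwise = {
    (p, q): count_differences(toy_genomes[p], toy_genomes[q])
    for p in toy_genomes
    for q in toy_genomes
    if p < q
}

for (p, q), n in sorted(pairwise.items()):
    print(f"{p} vs {q}: {n} difference(s)")
```

In this toy data the A-to-C count equals the A-to-B count plus the B-to-C count, which is the pattern expected when B sits between A and C on a single line of descent.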

“The localization of the B variant to East Asia could result from a founder effect: a genetic bottleneck that occurs when, in the case of a virus, a new type is established from a small, isolated group of infections,” the authors said.

They argue that there is another explanation worth considering.

“The Wuhan B-type virus could be immunologically or environmentally adapted to a large section of the East Asian population,” Dr. Forster said.

“It may need to mutate to overcome resistance outside East Asia. We seem to see a slower mutation rate in East Asia than elsewhere, in this initial phase.”

“The viral network we have detailed is a snapshot of the early stages of an epidemic, before the evolutionary paths of COVID-19 become obscured by vast numbers of mutations. It’s like catching an incipient supernova in the act.”

The study was published in the Proceedings of the National Academy of Sciences.

_____

Peter Forster et al. Phylogenetic network analysis of SARS-CoV-2 genomes. PNAS, published online April 8, 2020; doi: 10.1073/pnas.2004999117