(Image obtained from https://whyfiles.org/2015/eight-ways-microbes-keep-you-healthy/index.html)

Infectious disease outbreaks are frequent and impactful; many recur year after year (e.g., flu, measles, and malaria). The genetic information of the causative pathogen can provide important guidance on how to respond to the outbreak effectively and quickly. But what if the pathogen is novel, or has some genetic variation that hasn’t been seen before? Several outbreaks of emerging infectious diseases (EIDs) have occurred in recent years, including those caused by truly novel pathogens (e.g., SARS, MERS, H7N9) and those newly emergent in a particular location (e.g., Ebola, Zika).

Pathogens naturally evolve over time as genetic changes produce variants, referred to as ‘strains’ and ‘isolates’, within a species. Occasionally, pathogens evolve sufficiently far from their ancestor that a new species arises. Monitoring the evolution of new strains, isolates, or species of pathogens is important for detecting and managing disease outbreaks. Yet another cause of genetic variation needs to be considered when addressing EIDs: microorganisms that have been purposefully engineered and subsequently released, either intentionally (e.g., as a biological weapon) or accidentally (e.g., through a laboratory spill).

What is it? Where did it come from? Was it genetically engineered? If so, what was the pathogen designed to do? These are some of the questions that are important to answer when confronted with an unknown microbiological sample, whether from an infected individual, from a laboratory collection, or from an environmental source. The answers to these questions and others will empower decision makers to take the actions necessary to protect the population from a harmful biological event, including a possible biological weapon attack.

Advanced Genomics

Genomics, the ability to read, write, and edit DNA, has been integral to both academic and industrial life sciences for decades. In the last decade, new genomics tools that are collectively called “synthetic biology” (including CRISPR/Cas and so-called ‘living circuits’) have opened new possibilities for creating microorganisms that have never been seen before and for modifying existing ones. Unfortunately, genomics can also be used to intentionally alter the genetic code of a pathogen to make it more dangerous and more resistant to treatment.

Rapidly isolating and identifying an unknown pathogen from a new sample can be challenging, yet timely answers are especially important for mitigating the impact of an attack or inadvertent release. Why is this so challenging? What technologies could offer a solution?

Artistic rendering of CRISPR in action. (Picture credit: Stephen Dixon and Feng Zhang via Wired)

Project GEMstone

To answer these questions, we spoke with practitioners and scientists experienced in this area to identify the challenges associated with rapid sample interrogation and uncover some compelling technical capabilities. Their expertise spanned the areas of genomics, synthetic biology, bioinformatics, microbial forensics, microbiology, and biological defense. We learned that detecting evidence of intentional engineering in a pathogen, using current methods and resources, may take more time (several weeks) than is available during the earliest phase of an outbreak response. In addition, as synthetic biology techniques become more widely used, detecting clear “signals” of engineering will become more difficult because these methods will leave fewer traces behind. Other challenges we learned about include the highly variable nature of biological samples; the variable and sometimes unknown quality of data within public databases; the increasing number of privately held (and therefore less accessible) databases; custom-built, non-transferable bioinformatic tools, many of which cannot handle large datasets; and the complexity of genetic engineering approaches.

On a positive note, our colleagues identified potential solutions that have been used in other sectors but haven’t yet been applied to biological defense. Our experiences highlight the importance of engaging with technologists and scientists across sectors to find innovative solutions that can be leveraged for new applications. Read more about these findings in our roundtable report, published on the B.Next website.

Moving Forward

As we explored this space, two opportunities stood out to us, and we have decided to pursue both:

(1) Most new genome sequences are identified through comparison with sequences that have already been discovered. However, an increasingly large proportion of sequence data is held privately and is not easily accessed by the biological defense community. Private holders of microbial genome sequences are often unwilling to share what they regard as proprietary data, or fear that the data may be used to impose undesired regulatory oversight. Our colleagues expressed a desire for new tools that would permit database queries that protect both the data owner and the query owner from mutual exposure. Luckily, other sectors have developed and utilized private information retrieval techniques to accomplish this type of ‘query-in-place’ scenario. We have kicked off a project to explore the potential use of such methods for genomic matching.

(2) As noted above, detecting evidence of genetic engineering with high confidence can currently take several weeks. However, it is possible to develop some “triaging” tools to speed the process and avoid slow and costly laboratory work on low-risk samples. We are beginning to explore whether machine learning techniques could be a valuable part of a “triage toolbox” for flagging genomic sequences with a high probability of being genetically engineered.
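To make the ‘query-in-place’ idea in item (1) more concrete, here is a minimal sketch of one classic private information retrieval scheme: two non-colluding servers hold identical copies of a database, and the client learns one record without either server learning which one. This is only an illustration and not the method used in our project. The record contents, function names, and two-server setup are all assumptions made for the example. The scheme also retrieves a record by position rather than by sequence similarity, and it hides only the query; protecting the data owner as well would need additional techniques.

```python
import secrets

# Hypothetical, made-up records: fixed-length sequence fragments.
# Every record must have the same length for the XOR trick to work.
RECORDS = [b"ACGTACGT", b"TTGACCGA", b"GGCATCAT", b"CATGCATG"]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(n_records: int, wanted: int) -> tuple[list[int], list[int]]:
    """Build one selection vector per server.

    The first vector is uniformly random. The second is the same vector
    with the bit at the wanted position flipped. Each vector on its own
    is uniformly random, so neither server learns the wanted position.
    """
    first = [secrets.randbits(1) for _ in range(n_records)]
    second = first.copy()
    second[wanted] ^= 1
    return first, second


def server_answer(database: list[bytes], selection: list[int]) -> bytes:
    """Server side: XOR together every record whose selection bit is 1."""
    answer = bytes(len(database[0]))  # all-zero starting value
    for record, bit in zip(database, selection):
        if bit:
            answer = xor_bytes(answer, record)
    return answer


def retrieve(copy_one: list[bytes], copy_two: list[bytes], wanted: int) -> bytes:
    """Client side: query both servers and combine their answers.

    Records selected by both vectors cancel out under XOR, leaving only
    the record at the position where the two vectors differ.
    """
    first, second = make_queries(len(copy_one), wanted)
    return xor_bytes(server_answer(copy_one, first),
                     server_answer(copy_two, second))
```

Calling `retrieve(RECORDS, RECORDS, 2)` returns the third record. The privacy guarantee holds only if the two servers do not compare the selection vectors they receive.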
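As a rough illustration of what a machine-learning “triage” step from item (2) might look like, the sketch below turns each sequence into a vector of k-mer frequencies and scores it with a nearest-centroid classifier. This is a simple stand-in and not the approach we have chosen. The training sequences are invented toy strings with no biological meaning, and the labels, the value of k, the threshold, and all function names are assumptions made for the example.

```python
import math
from collections import Counter
from itertools import product

K = 3  # k-mer length, chosen arbitrarily for this sketch
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

# Invented toy training data; real work would need curated, labelled genomes.
TOY_TRAINING = {
    "natural": ["ATATATATAT"],
    "engineered": ["GCGCGCGCGC"],
}


def kmer_profile(seq: str) -> list[float]:
    """Relative frequency of every possible k-mer over A, C, G, T."""
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[k] for k in KMERS) or 1  # avoid division by zero
    return [counts[k] / total for k in KMERS]


def centroid(profiles: list[list[float]]) -> list[float]:
    """Element-wise mean of a list of profiles."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(labelled: dict[str, list[str]]) -> dict[str, list[float]]:
    """Compute one centroid per label from labelled sequences."""
    return {label: centroid([kmer_profile(s) for s in seqs])
            for label, seqs in labelled.items()}


def engineered_score(model: dict[str, list[float]], seq: str) -> float:
    """Score in [0, 1]; higher means closer to the 'engineered' centroid."""
    profile = kmer_profile(seq)
    d_eng = distance(profile, model["engineered"])
    d_nat = distance(profile, model["natural"])
    if d_eng + d_nat == 0:
        return 0.5  # centroids coincide with the sample; no information
    return d_nat / (d_eng + d_nat)


def triage(model: dict[str, list[float]], samples: dict[str, str],
           threshold: float = 0.5) -> list[tuple[str, float]]:
    """Return (name, score) for samples at or above the threshold, highest first."""
    scored = sorted(((engineered_score(model, seq), name)
                     for name, seq in samples.items()), reverse=True)
    return [(name, score) for score, name in scored if score >= threshold]
```

A call such as `triage(train(TOY_TRAINING), samples)` would return only the samples worth sending on for slower laboratory confirmation. A score from a model like this is a prioritization aid, not evidence of engineering.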

We will be posting more information and results on both of these projects in the near future. So, stay tuned!