Cloud technology giving us a “CRISPR” view to help cure disease

By Dr Denis Bauer

Did you know that the genome regulates almost all functions in our body? All of this is encoded in the 3 billion letters of our genome. These letters can change over a person’s lifespan or differ between families. Some of these changes have no effect, while others can cause cancer or genetic diseases. Reversing these changes could help cure these genome-based diseases.

A new molecular engineering technology allows scientists to edit DNA with greater accuracy and precision than previous technologies. This method, called CRISPR-Cas9, can be programmed to recognise and edit specific locations in the genome by pattern-matching unique sequences of DNA.
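To give a feel for what "pattern-matching" means here, the following is a minimal sketch rather than how any particular design tool works. It assumes the pattern commonly described for the Cas9 enzyme from Streptococcus pyogenes: a 20-letter target followed by any letter and then "GG". The function name `find_candidate_sites` and the example sequence are made up for illustration, and only one strand of the DNA is searched.

```python
import re

# Assumption: a 20-letter target followed by "NGG" (N = any base), the
# motif commonly described for Cas9 from Streptococcus pyogenes.
# The lookahead lets overlapping candidate sites be reported as well.
SITE_PATTERN = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")


def find_candidate_sites(dna):
    """Return (position, 20-letter target) pairs found on the given strand."""
    return [(m.start(), m.group(1)) for m in SITE_PATTERN.finditer(dna.upper())]


# Hypothetical 30-letter sequence, used only to show the call.
example = "TTGACCTGAAGTCCGATTACGGCATGGAAC"
for position, target in find_candidate_sites(example):
    print(position, target)
```

A real genome is billions of letters long and both strands matter, so this only shows the principle of recognising a site by its sequence.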

In early 2016, the US National Institutes of Health (NIH) approved the use of these technologies for human health. This means that we can now explore revolutionary approaches to cancer treatment. With this new technology, a cancer patient’s own immune system is boosted through specific modifications of the cells that naturally fight cancer. The approach has the potential to be effective against a wide range of tumours, with the current trial including patients with specific blood and solid cancers, as well as melanoma.

Genome engineering boosted by cloud technology

This new application in human health requires an increase in robustness and efficiency of CRISPR-Cas9 design in order to meet the time constraints of clinical care. To address this, we developed a software tool called GT-Scan2 that helps choose the optimal CRISPR target site. This is important because it helps avoid diluting the effect due to “off-targets”, which are other sites in the genome capable of diverting the CRISPR molecule. It also optimises robustness by finding sites that are easier to modify. Our software can identify genomic locations with higher sensitivity and specificity than other published methods.

The off-target search in particular is a compute-intensive task traditionally reserved for researchers at large institutes. High-performance compute infrastructure is required because every location in the 3-billion-letter genomic sequence needs to be investigated. GT-Scan2 democratises the ability to find optimal sites by offering this complex computation as a cloud service.
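The reason every location has to be inspected can be seen in a simple sketch. This is not GT-Scan2's actual algorithm; it assumes, for illustration only, that an off-target is any stretch of the genome that differs from the intended target in at most a few letters. The names `count_mismatches` and `find_near_matches` and the `max_mismatches` threshold are hypothetical.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(genome, target, max_mismatches=3):
    """Slide the target along the genome and report every near match.

    Returns (position, mismatches) pairs. The intended site shows up with
    0 mismatches; every other entry is a potential off-target.
    """
    width = len(target)
    hits = []
    for start in range(len(genome) - width + 1):
        distance = count_mismatches(genome[start:start + width], target)
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits


# Toy example with a 4-letter target and at most 1 mismatch allowed.
print(find_near_matches("AAAACCCCAAAT", "AAAA", max_mismatches=1))
```

The work grows with the length of the genome, which is why a search over 3 billion letters calls for substantial compute resources.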

Scaling instantaneously for personalised treatments

We have been working with the Amazon Web Services (AWS) team to identify the ideal solution for our software. We’ve been using Lambda, an AWS service first introduced in late 2014, which lets our software quickly adjust compute resources to match the complexity of the analysis task. It processes billions of base pairs in its off-target search by subdividing the job into independent, modular tasks that can be run in parallel. A typical GT-Scan2 job takes less than a minute, and thanks to Lambda we can keep the runtime constant irrespective of how complex the task is.
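The idea of subdividing a search into independent tasks can be sketched as follows. This is an illustration under stated assumptions, not the GT-Scan2 code: a local thread pool stands in for the cloud functions, the matching rule (a fixed number of allowed mismatches) is a simplification, and all function names and parameters are hypothetical. Each chunk is extended by a small overlap so that sites spanning a chunk boundary are still checked exactly once.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk, offset, target, max_mismatches):
    """Independent task: report near matches in one chunk, in genome coordinates."""
    width = len(target)
    hits = []
    for i in range(len(chunk) - width + 1):
        distance = sum(x != y for x, y in zip(chunk[i:i + width], target))
        if distance <= max_mismatches:
            hits.append((offset + i, distance))
    return hits


def split_with_overlap(genome, chunk_size, overlap):
    """Yield (offset, chunk) pairs; each chunk carries `overlap` extra letters."""
    for start in range(0, len(genome), chunk_size):
        yield start, genome[start:start + chunk_size + overlap]


def parallel_scan(genome, target, max_mismatches=1, chunk_size=1000, workers=4):
    """Fan the chunks out to workers, then merge the partial results."""
    jobs = list(split_with_overlap(genome, chunk_size, len(target) - 1))
    # Assumption: threads stand in for separately invoked cloud functions.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(
            lambda job: scan_chunk(job[1], job[0], target, max_mismatches), jobs
        )
        return sorted(hit for hits in partial for hit in hits)


print(parallel_scan("AAAACCCCAAAT", "AAAA", max_mismatches=1, chunk_size=5))
```

Because no task depends on another, adding more workers for a larger job keeps the elapsed time roughly steady, which is the property the cloud service relies on. In Python, threads mainly illustrate the fan-out pattern; real speed-ups would come from separate processes or remote functions.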

What the future holds

While we are looking for opportunities to see GT-Scan2 applied in clinical settings, the team is also developing solutions to analyse whole-genome profiles and identify novel disease genes. We use Big Data technology to push the boundary of how much data can be analysed. Our software currently classifies 3,000 individuals, each with a profile containing 80 million DNA variants, in under 30 minutes, and can scale to even larger datasets. This is important because common diseases such as diabetes are complex and need cohorts the size of whole populations to find better treatments.

You can find out more about our innovative health technologies here.