What's Phylo all about?

The comparison of genomes from various species is one of the most fundamental and powerful techniques in molecular biology. It helps us decipher our DNA and identify new genes. Though it may appear to be just a game, Phylo is actually a framework for harnessing the computing power of mankind to solve the Multiple Sequence Alignment problem.

What is a Multiple Sequence Alignment?

A sequence alignment is a way of arranging the sequences of DNA, RNA or protein to identify regions of similarity. These similarities may be consequences of functional, structural, or evolutionary relationships between the sequences. From such an alignment, biologists may infer shared evolutionary origins, identify functionally important sites, and illustrate mutation events. More importantly, biologists can trace the source of certain genetic diseases.
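
To make the idea concrete, here is a minimal sketch in Python of how an alignment can be scored. It is an illustration only, not Phylo's actual scoring scheme: the function names and the weights (+1 for a match, -1 for a mismatch, -1 for a gap) are assumptions chosen for simplicity. Each sequence is a row, gaps are written as "-", and every column is scored by comparing all pairs of rows (a "sum-of-pairs" score).

```python
from itertools import combinations

GAP = "-"

def column_score(column, match=1, mismatch=-1, gap=-1):
    """Sum-of-pairs score for one column (weights are illustrative assumptions)."""
    score = 0
    for a, b in combinations(column, 2):
        if a == GAP and b == GAP:
            continue              # two gaps carry no information
        elif a == GAP or b == GAP:
            score += gap          # a base facing a gap
        elif a == b:
            score += match        # identical bases
        else:
            score += mismatch     # different bases
    return score

def alignment_score(rows, **weights):
    """Total score of an alignment given as equal-length strings."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all rows must have the same length")
    return sum(column_score(col, **weights) for col in zip(*rows))

# The same three sequences, padded naively and then aligned with gaps.
padded  = ["ACGTA", "ACTA-", "AGTA-"]
aligned = ["ACGTA", "AC-TA", "A-GTA"]

print(alignment_score(padded))   # -2
print(alignment_score(aligned))  # 7
```

Placing the gaps so that similar bases line up raises the score from -2 to 7, which is the kind of improvement an alignment is meant to capture.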

The Problem

Traditionally, multiple sequence alignment algorithms use computationally complex heuristics to align the sequences. Unfortunately, heuristics do not guarantee a globally optimal alignment, and computing an optimal one would be prohibitively expensive. This is due in part to the sheer size of the human genome, which consists of roughly three billion base pairs, and to the computational complexity that grows with each additional sequence in an alignment.
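
A rough way to see why each additional sequence hurts: the textbook exact method (dynamic programming) fills a table with one cell for every combination of positions across all sequences, and each cell looks back at every way of advancing in at least one sequence. The sketch below only counts that work. The function names are hypothetical, and this is the generic exact approach, not the heuristic used by any particular tool.

```python
from math import prod

def dp_cells(lengths):
    """Cells in an exact dynamic-programming table for sequences of these lengths."""
    return prod(n + 1 for n in lengths)

def predecessors_per_cell(k):
    """Each cell depends on every non-empty subset of the k sequences advancing."""
    return 2 ** k - 1

for k in (2, 3, 8):
    lengths = [100] * k
    print(k, dp_cells(lengths), predecessors_per_cell(k))
```

For sequences of only 100 bases each, two sequences need 10,201 cells, while eight sequences need about 10^16 cells, each with 255 predecessors to inspect. This is why practical tools fall back on heuristics.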

Our Approach

Humans have evolved to recognize patterns and solve visual problems efficiently. By abstracting multiple sequence alignment into the manipulation of patterns of coloured shapes, we have adapted the problem to benefit from human capabilities. By taking data which has already been aligned by a heuristic algorithm, we allow the user to improve the alignment where the algorithm may have failed.
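
As an illustration of this abstraction, the sketch below turns a row of an alignment into coloured tiles and lets a single tile slide into a neighbouring empty cell. The colour assignment and the function names are assumptions made for this example and do not describe Phylo's internals. The important property is that a move changes only where the gaps sit, never the order of the bases.

```python
GAP = "-"

# Hypothetical colour assignment, chosen only for this example.
COLOURS = {"A": "green", "C": "blue", "G": "orange", "T": "purple"}

def to_tiles(row):
    """Translate a row into colours; None marks an empty cell (a gap)."""
    return [COLOURS.get(base) for base in row]

def slide_right(row, index):
    """Move the tile at `index` one cell to the right into an adjacent gap."""
    cells = list(row)
    if cells[index] == GAP:
        raise ValueError("no tile at this position")
    if index + 1 >= len(cells) or cells[index + 1] != GAP:
        raise ValueError("no empty cell to the right")
    cells[index], cells[index + 1] = cells[index + 1], cells[index]
    return "".join(cells)

row = "ACTA-"
moved = slide_right(row, 3)
print(to_tiles(row))
print(moved)  # ACT-A
```

Removing the gaps from the row before and after the move gives the same sequence, so the player rearranges the alignment without altering the underlying DNA.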

The Data

All alignments were generously made available through the UCSC Genome Browser. In fact, all alignments contain sections of human DNA which have been speculated to be linked to various genetic disorders, such as breast cancer. Every alignment is received, analyzed, and stored in a database, from which it will eventually be reintroduced into the global alignment as an optimization.