Because the genome contains such a wealth of information, written in a language we don’t quite understand, it is of utmost importance to organize it in meaningful ways. As there is no precedent for this syntax, recognizing and documenting recurring patterns is key. Currently there are roughly five conventional methods of visualizing genomic data.

a) Tracks: The bane of academics and government researchers everywhere, tracks represent sequences as rows. Track browsers can display multiple dimensions of the same data set as well as comparative sets. The UCSC Genome Browser is the standard example of this method. However, the usefulness of tracks is often limited to close-up views of specific targets.

b) Heat Maps: These are likely what the layperson imagines when they think of genomic visualization: normally a rectangle of multi-colored blocks arranged in rows and columns. This method is particularly prevalent with microarray data. IGV, the Integrative Genomics Viewer, can generate heat maps quite well, as can some all-purpose tools such as R and Gnuplot.

c) Circular Genome Maps: A method that attempts to deliver the robustness of track maps in a less overwhelming form. Here strips of data are aligned in concentric circles, making it easier to see possible correlations. Circular maps have recently proven especially helpful in showcasing drug/gene interactions. Genome Projector by Arakawa et al. can generate great circular genome maps, among other views.

d) DNA Walks: A relatively new method that represents genomic data as vectors in a two-dimensional plane, where each letter (A, T, G, C) denotes a direction (up, down, left, right). This method has been helpful in generating a single unique image representative of the sequence in question, and is particularly adept at showcasing small changes in structural content, e.g. GC-rich regions or poly-A tails.

e) Network Maps: Originally used to help understand computer networks, this method has quickly proven valuable to systems biology. When the genome is viewed this way, pathways of interaction that were once obscure move to the foreground, and inherent divisions of function within the genome become visible. The go-to software at the moment is Cytoscape, although Ayasdi is proving to be a wonderful competitor.
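
To make the DNA walk in (d) concrete, here is a minimal sketch in Python. The letter-to-direction mapping (A up, T down, G left, C right) simply follows the order in which the letters and directions are listed above; other conventions exist, so treat it as an assumption. The function name `dna_walk` and the choice to skip ambiguous bases such as N are also illustrative, not taken from any particular tool.

```python
# Assumed mapping: A = up, T = down, G = left, C = right.
# This follows the order the letters and directions are listed in the post;
# other DNA walk conventions assign the directions differently.
STEPS = {
    "A": (0, 1),
    "T": (0, -1),
    "G": (-1, 0),
    "C": (1, 0),
}


def dna_walk(sequence):
    """Return the list of (x, y) points visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in sequence.upper():
        step = STEPS.get(base)
        if step is None:
            # Skip anything that is not A, T, G or C (e.g. N). This is an
            # illustrative choice, not part of the method as described.
            continue
        x += step[0]
        y += step[1]
        path.append((x, y))
    return path


if __name__ == "__main__":
    # A short GC-rich stretch followed by a poly-A tail.
    for point in dna_walk("GCGCGCAAAAAA"):
        print(point)
```

Under this mapping a poly-A tail shows up as a straight vertical run, while an alternating GC stretch oscillates horizontally around one spot. Plotting the returned points with any charting tool gives the image described above.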

Life science researchers, clinicians, and software engineers all stand to benefit from this work, and all are needed for humanity to get a useful grasp on this powerful language. These visualization techniques are, as mentioned, tried and tested. We must learn from them, evolve them, and iterate on them to build new tools that strike a balance between imposing our own will on the data and showcasing its inherent, underlying structures.

See:

Previous post “Chaos Game Analysis of Genomes” & work by GeneDrop.
