In the primordial soup that was early Earth, life started small. Elements joined to form the simple carbon-based molecules that were the precursors of everything that was to come. But there is debate about the next step.

One popular hypothesis suggests that ribonucleic acid (RNA) molecules, which contain the genetic blueprints for proteins and can perform simple chemical reactions, kick-started life. Some scientists dispute this idea, however, saying RNA is too large and complex a molecule to have started it all. That group says simpler molecules had to evolve the ability to perform metabolic functions before macromolecules such as RNA could be built. This idea is appropriately named "metabolism-first," and new evidence out of the University of Illinois backs it up.

"All living organisms have a metabolism, a set of life-sustaining chemical transformations that provide the energy and matter needed for the functions of the cell. These metabolic transformations are assumed to have occurred very early in life, in primitive Earth. Organisms probably replaced chemical reactions already going on in the planet and internalized them into cells through development of enzymatic activities," says Gustavo Caetano-Anollés, bioinformatician and professor in the Department of Crop Sciences at U of I.

Caetano-Anollés and Ibrahim Koç, a visiting scholar in the department, found evidence for the "metabolism-first" hypothesis by studying the evolution of molecular functions in organisms representing all realms of life. The genomes, or complete sets of genes, of 249 organisms were available in a searchable database. What sets this resource, known as the Gene Ontology (GO) database, apart is that each gene product, a protein or RNA molecule, comes with a set of terms describing its function.

"You can take an entire genome that represents an organism, like the human genome, and visualize it through the collection of functionalities of its genes. The study of these 'functionomes' tells us what genes do, instead of focusing on their names and locations. For example, we can find out what kinds of catalytic, recognition, or binding activities a gene product has, which is much more intuitive," Caetano-Anollés notes. "The best way to understand an organism is through its functions."

According to Caetano-Anollés, the number of times a function appears in a genome provides historical information. So the team took the GO terms describing all of the molecular functions in each organism and counted them up. The idea was that an ancient function, such as the catalytic activity of metabolism, is likely shared by all organisms and will be found in large numbers. On the other hand, more recent functions are found in lower numbers and in smaller subsets of organisms.
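The counting step described above can be illustrated with a minimal Python sketch. It is not the team's actual pipeline: the organism names, the annotation lists, and the helper functions `summarize` and `rank_by_abundance` are all hypothetical, and simple ranking by count stands in for the far more involved tree-building analysis the researchers used. It only shows the basic intuition that a function found many times and in every organism is a candidate for being ancient.

```python
from collections import Counter

# Hypothetical toy annotations (not real data): each organism maps to a list
# of GO molecular function terms, one entry per annotated gene product.
genomes = {
    "organism_a": ["catalytic activity", "catalytic activity", "binding",
                   "structural molecule activity"],
    "organism_b": ["catalytic activity", "binding", "binding"],
    "organism_c": ["catalytic activity", "binding", "catalytic activity",
                   "receptor activity"],
}


def summarize(genomes):
    """Return {term: (total_count, n_organisms)} across all genomes."""
    totals = Counter()  # how many times each term appears overall
    spread = Counter()  # how many organisms carry each term at least once
    for terms in genomes.values():
        counts = Counter(terms)
        totals.update(counts)
        spread.update(counts.keys())
    return {term: (totals[term], spread[term]) for term in totals}


def rank_by_abundance(summary):
    """Order terms from most widespread and abundant to least."""
    return sorted(
        summary,
        key=lambda term: (summary[term][1], summary[term][0]),
        reverse=True,
    )


summary = summarize(genomes)
ranking = rank_by_abundance(summary)
for term in ranking:
    total, n_organisms = summary[term]
    print(f"{term}: {total} occurrences in {n_organisms} organisms")
```

In this toy data set, catalytic activity and binding appear in all three organisms and rank first, while the two functions found in a single organism fall to the bottom, mirroring the pattern the article describes for ancient versus recent functions.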

The team used the information and advanced computational methods to construct a tree that traced the most likely evolutionary path of molecular functions through time. At the base of the tree, close to its roots, were the most ancient functions. The most recent were close to the crown.

At the base of the tree, corresponding to the origin of life on Earth, were functions related to metabolism and binding. "It is logical that these two functions started very early because molecules first needed to generate energy through metabolism and had to interact with other molecules through binding," Caetano-Anollés explains.

The next major advancements were functions that made the rise of macromolecules possible, which is when RNA might have entered the picture. Then came the machinery that integrated molecules into cells, followed by the rise of functions allowing communication between cells and their environments. "Finally, as you move toward the crown of the tree, you start seeing functions related to highly sophisticated processes involving things like muscle, skin, or the nervous system," Caetano-Anollés says.

The research doesn't just shed light on the past. Knowing the progression of these molecular functions through time can help predict where life on Earth is headed. "People think of evolution as looking backwards," Caetano-Anollés says. "But we could use our chronologies and methodologies to ask what novel molecular functions will be generated in the future."

The work has applications for bioengineering, an emerging field that uses biological information and computation to produce novel molecules. Engineered molecules could combat disease and improve the quality of everyday life, according to Caetano-Anollés. "The best way to reengineer biological molecules with novel and useful molecular functions is to learn principles from clues left behind in their past," he says.

The article, "The natural history of molecular functions inferred from an extensive phylogenomic analysis of gene ontology data," is published in PLoS One.