CHAMPAIGN, Ill. -- Scientists have sequenced the genomes of nearly 6,900 organisms, but they know the functions of only about half of the protein-coding genes thus far discovered. Now a multidisciplinary effort involving 15 scientists from three institutions has begun chipping away at this mystery - in a big way. Their work to identify the function of one bacterial protein and the biochemical pathway in which it operates will also help identify the functions of hundreds of other proteins.

A report of their new approach and findings appears in the journal Nature.

The research team used computational methods combined with a broad array of laboratory techniques to narrow the list of possible small molecules that interact with the unknown protein, an enzyme (now known as HpbD), and to identify its role in its host, the marine bacterium Pelagibaca bermudensis.

The goal was not simply to identify the protein's function but to forge a new way to tackle the vast and growing body of sequence data for which functional information is lacking, said University of Illinois biochemistry professor John Gerlt, one of five co-principal investigators on the study.

"At present, the number of proteins in the protein-sequence database is approaching 42 million," Gerlt said. "But no more than 50 percent of these proteins have reliable functions assigned to them."

Without knowing what all of the proteins that are encoded by a genome do, "one simply cannot understand the biology of the organism," Gerlt said.

The new effort is part of the Enzyme Function Initiative (EFI) at the Institute for Genomic Biology at Illinois. This initiative, funded by the National Institute of General Medical Sciences and led by Gerlt, is designed to address "complex problems that are of central importance to biomedical science but are beyond the means of any one research group." The EFI focuses on enzymes of bacterial origin.

"There was a time when I would apologize that we were focusing on bacterial genomes and not human genomes," Gerlt said. "However, it is now well established that we do not live in isolation, that we have a microbiome associated with us and that microbiome is made up of thousands of different bacterial species that inhabit our bodies. It is very important for us to understand what these bacteria are capable of doing."

Matthew Jacobson and postdoctoral researcher Suwen Zhao at the University of California, San Francisco, led the computational effort that was at the heart of streamlining the process of protein discovery for the group. Their method pairs an enzyme with tens of thousands of possible metabolic partners to see which molecules fit together best. Since enzymes act on other molecules to perform a specific function, identifying an enzyme's target (also called its substrate) offers a big clue to the enzyme's activity.

This process led to the identification of four possible substrates (out of an original list of more than 87,000). Zhao passed along the identities of these four substrates, and a likely pathway in which the enzyme operated, to Gerlt and his colleagues (microbiology professor John Cronan and chemistry professor Jonathan Sweedler, both at Illinois, and Steven Almo at the Albert Einstein College of Medicine). Then the painstaking laboratory work began.

Several lines of research helped identify which of the four substrates actually interact with the enzyme and confirmed both the function of the enzyme and the chemical pathway in which it operates.

The researchers discovered that their enzyme catalyzes the first step in a biochemical pathway that enables the marine bacterium to consume one of the substrates identified in Jacobson's lab. The bacterium uses the substrate, known as tHypB (tee-hype-bee), as a carbon source.

The team also discovered that tHypB has another, perhaps more important, role in the bacterium: It helps the organism deal with the stress of life in a salty environment, Gerlt said.

This effort to understand the function of one enzyme offers a cascade of other benefits, Gerlt said. One big advantage of this approach is that it aids in the identification of orthologs (enzymes that perform the same task in other organisms).

"There are dozens of orthologs in the protein database that were identified by Patricia Babbitt and her colleagues at UCSF, so we determined not only the function of one but we also determined the functions of all these enzymes," he said. And because the researchers also identified the functions of all the enzymes in the pathway that allows the microbe to consume tHypB, their work offers insight into the role of orthologous enzymes in similar pathways in other organisms.

Researchers with the EFI are working to develop strategies and tools that other researchers can use to accomplish similar feats of discovery.

"There was a time when a researcher devoted his or her entire career to a single enzyme," Gerlt said. "That was a long time ago, although some people still practice that. Now, genome-sequencing technology has changed the way that biologists have to look at problems. We can't keep looking at problems in isolation."

###

Editor's notes: To reach John Gerlt, email j-gerlt@illinois.edu.

Steven Almo, email steve.almo@einstein.yu.edu.

Patricia Babbitt, email Babbitt@cgl.ucsf.edu.

John Cronan, email jcronan@life.uiuc.edu.

Matthew Jacobson, email Matt.Jacobson@ucsf.edu.

Jonathan Sweedler, email jsweedle@illinois.edu.

The paper, "Discovery of New Enzymes and Metabolic Pathways Using Structure and Genome Context," is available to members of the media from the U. of I. News Bureau.