One often hears about the multitude of genes we have in common with chimps, birds or other living creatures, but such comparisons are sometimes misleading. The shared percentage usually refers only to genes that encode instructions for making proteins -- while overlooking regulatory genes, which nonetheless make up a large part of the genome. "Humans and fish, for instance, share about 70% of their protein-coding genes, but only about 0.5% of an important class of regulatory genes -- ones that give rise to so-called long non-coding RNAs, or lncRNAs," says Dr. Igor Ulitsky of the Biological Regulation Department at the Weizmann Institute of Science.

Until recently, lncRNAs (pronounced link-RNAs) received much less attention than protein-coding genes, but they are now attracting growing scientific interest. Not only are there as many as 20,000 lncRNA genes in the human genome -- about the same number as protein-coding genes -- but lncRNAs have lately been shown to serve as master switches in a wide variety of biological processes. They turn genes on and off and affect other regulatory genes, controlling cellular fate during fetal development, as well as cellular division and death in the adult organism. These master regulators may therefore hold the key to elucidating or even treating a variety of diseases.

To make sense of lncRNAs, scientists are trying to understand how they appeared in the genome and whether they can be grouped into classes according to their activity. In a recent study published in the journal Genome Biology, Ulitsky and his team -- research students Hadas Hezroni, Gali Housman and Zohar Meir, and staff scientists Drs. Rotem Ben-Tov Perry and Yoav Lubelsky -- managed to identify a class of mammalian lncRNAs that had evolved from more ancient genes by taking on new functions.

The scientists started out with the assumption that evolution is an economical process: If a gene loses its function, it is likely to be "recycled" for different purposes in the cell. "Just as bricks from a ruined monument can help to build a new house, so genes that went out of use can find new roles in the cell in the course of evolution," Ulitsky explains.

His team members developed a series of algorithms that enabled them to find such "recycled" genes in the mammalian genome. First, they identified nearly 1,000 genes that code for proteins in chickens, fish, lizards and other non-mammalian vertebrates, but not in humans, dogs, sheep and other mammals. The scientists hypothesized that at least some of these genes, after losing their protein-coding function, started manufacturing lncRNAs in mammals. By comparing "gene neighborhoods" in the vicinity of lncRNAs with those of genes that had stopped coding for proteins, the researchers found that about 60 lncRNA genes in mammals -- or 2% to 3% of the lncRNAs shared by humans and other mammalian species -- do indeed appear to be derived from ancestral protein-coding genes. In some cases their genetic sequence is still similar to that of the ancient genes, but they have lost their protein-coding ability.
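The "gene neighborhood" comparison can be illustrated with a toy sketch. This is not the team's actual pipeline, which the article does not describe in detail; it is a minimal illustration that assumes each genome is simply an ordered list of genes along a chromosome. The function names (`flanking_anchors`, `find_recycled_candidates`) and the gene labels are hypothetical. The idea is to take a gene that codes for a protein in a non-mammal but not in a mammal, and ask whether a mammalian lncRNA sits between the same two conserved neighbors.

```python
from typing import Dict, List, Optional, Set, Tuple

# A gene is (name, kind); kind is "coding" or "lncRNA". Toy representation only.
Gene = Tuple[str, str]


def flanking_anchors(genes: List[Gene], idx: int,
                     anchors: Set[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return the nearest anchor genes to the left and right of position idx."""
    left = next((n for n, _ in reversed(genes[:idx]) if n in anchors), None)
    right = next((n for n, _ in genes[idx + 1:] if n in anchors), None)
    return left, right


def find_recycled_candidates(non_mammal: List[Gene],
                             mammal: List[Gene]) -> Dict[str, str]:
    """Map each lost protein-coding gene to a mammalian lncRNA that
    occupies the same neighborhood (same pair of flanking anchor genes)."""
    # Anchors: protein-coding genes still present in the mammal.
    anchors = {n for n, k in mammal if k == "coding"}

    # Index mammalian lncRNAs by their (unordered) pair of flanking anchors.
    lnc_by_flank = {}
    for i, (name, kind) in enumerate(mammal):
        if kind == "lncRNA":
            left, right = flanking_anchors(mammal, i, anchors)
            if left and right:
                lnc_by_flank[frozenset((left, right))] = name

    candidates = {}
    for i, (name, kind) in enumerate(non_mammal):
        # Coding in the non-mammal, but not coding in the mammal.
        if kind == "coding" and name not in anchors:
            left, right = flanking_anchors(non_mammal, i, anchors)
            if left and right:
                match = lnc_by_flank.get(frozenset((left, right)))
                if match is not None:
                    candidates[name] = match
    return candidates


# Hypothetical gene orders, invented for illustration.
chicken = [("A", "coding"), ("X", "coding"), ("B", "coding"),
           ("C", "coding"), ("Y", "coding"), ("D", "coding")]
human = [("A", "coding"), ("lnc1", "lncRNA"), ("B", "coding"),
         ("C", "coding"), ("D", "coding"), ("lnc2", "lncRNA"),
         ("E", "coding")]

print(find_recycled_candidates(chicken, human))
```

In this toy data, gene X lies between A and B in the chicken, and the human lncRNA lnc1 lies between the same two genes, so X is flagged as a possible ancestor of lnc1. Gene Y has no lncRNA in its human neighborhood and is not flagged. A real analysis would also have to handle rearrangements, multiple chromosomes and sequence similarity, all of which this sketch ignores.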

"It is hard to know what caused these genes to lose their protein-coding potential more than 200 million years ago, when mammals evolved from their vertebrate ancestors," Ulitsky says. "But the fact that these genes have been conserved in the genome for so long suggests that they play important roles in the cell."

Identifying such "fossils" of protein-coding genes in the mammalian genome will facilitate further study of human lncRNAs and may ultimately help scientists understand what happens when their function is disrupted. For example, lncRNAs help create different types of neurons in the fetal brain; their failure to properly determine the fate of these neurons may contribute to epilepsy. Because lncRNAs are involved in controlling cell division, their malfunction may be implicated in cancer. Finally, manipulating lncRNAs may make it possible to treat certain genetic disorders.

Explains Ulitsky: "In recent years, lncRNAs were found to be important for the activation or repression of genes relevant to a variety of disorders. It may one day be possible to treat these disorders by targeting the lncRNAs so as to reprogram entire gene regulatory networks. For example, in a study in mice, researchers at the Baylor College of Medicine in Houston, Texas, had averted progression of Angelman syndrome, caused by mutations on chromosome 15, by silencing a particular lncRNA -- to unleash expression of a gene that it represses."