To the best of our ability to tell, all life on Earth shares a few common features. It encodes information in DNA using four bases: A, T, C, and G. Sets of three consecutive bases code for a single amino acid, and most organisms use a set of 20 amino acids to build proteins. These features appear everywhere, from plants and animals to bacteria and viruses, suggesting that they appeared in the last common ancestor of life on Earth.

This raises a question that comes up a lot in evolutionary studies: are these features used because they're in some way efficient, or did we end up stuck with them as a result of some historic accident?

A team of California-based researchers has been building an argument that it's an accident. And it's doing so by expanding life beyond the limitations inherited from its common ancestor. After having expanded the genetic alphabet to six letters, the team has now engineered a bacterial strain that uses the extra letters to put an unnatural amino acid into proteins.

New chemistry

The four bases in DNA are chemicals with a specific structure: flat rings with nitrogens and oxygens that can participate in hydrogen bonding. They form a specific number of bonds that allow them to pair: A forms two with T, while G forms three with C. These bonds hold together the double helix of DNA, but they also allow DNA to be transcribed into RNA, which uses a similar set of four bases (with T replaced by its close cousin U). And RNA uses this same base pairing to match each set of three bases to a specific amino acid.
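
The pairing rules are simple enough to write down as a lookup table. What follows is a minimal sketch in Python, not anything from the research itself: the function names and the example sequence are made up for illustration, and it ignores real-world details such as the direction in which strands are read.

```python
# Pairing partners for the four DNA bases, and the number of
# hydrogen bonds each base forms with its partner.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def complement(strand: str) -> str:
    """Return the base-by-base pairing partner of a DNA strand."""
    return "".join(DNA_PAIR[base] for base in strand)

def transcribe(template: str) -> str:
    """Build the RNA that pairs with a DNA template (U stands in for T)."""
    return complement(template).replace("T", "U")

def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding a strand to its partner."""
    return sum(H_BONDS[base] for base in strand)

template = "TACGGA"  # arbitrary example sequence
print(complement(template))      # ATGCCT
print(transcribe(template))      # AUGCCU
print(hydrogen_bonds(template))  # 15
```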

Sets of three bases can encode 64 (4 x 4 x 4) possible items, but there are only 20 amino acids. So there's already room for some flexibility in terms of changing the genetic code. Unfortunately, all of life is already using all of those original 64 possible combinations to mean something. So changing the meaning of any one of them would require re-engineering an organism's entire genome.
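
To see where the 64 comes from, and why none of the combinations are spare, it helps to list them. This is an illustrative Python sketch; the leucine and stop entries are taken from the standard genetic code, and everything else (variable names, what gets printed) is just one way to lay it out.

```python
from itertools import product

RNA_BASES = "ACGU"

# Every possible three-base combination of the four RNA bases.
codons = ["".join(p) for p in product(RNA_BASES, repeat=3)]

# A few entries from the standard genetic code: leucine alone is
# spelled six different ways, and three codons mean "stop".
LEUCINE = {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"}
STOP = {"UAA", "UAG", "UGA"}

print(len(codons))              # 64
print(LEUCINE <= set(codons))   # True
print(len(codons) - len(STOP))  # 61 codons shared among 20 amino acids
```

The redundancy is real, but it is redundancy in use: every one of the 64 entries already has a meaning.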

Some researchers at Scripps, collaborating with a biotech company, decided to just expand the possibilities by adding two more bases that can interact with each other but not any of the four existing bases. To make sure that works, they got rid of the hydrogen bonding entirely; instead, the new bases interact through hydrophobic contacts—they stay stuck together because neither interacts well with the watery environment around them.

It's easy to draw out the bases' structures and see that they'd fit into DNA. And it's possible to show that they work in a test tube. But getting them to work in an organism is a different matter entirely.

Reimagining life

Cells don't make these artificial bases, and there are no enzymes that can. So the researchers have to supply them. But they're needed inside cells, which means they have to cross the membrane. It took a bit of searching for the authors to find a protein that would transport them across. Given all that, plus some DNA that already incorporates the artificial bases, bacterial cells would continue to use them.

Getting these new bases to be used as part of the genetic code is another, much more complicated matter. The three-base genetic code is translated using RNA, so the researchers needed to supply cells with an RNA version of the bases as well. The translation involves small transfer RNAs (tRNAs), which match the three-base code and are chemically linked to an amino acid. So the cells had to be given a tRNA gene that contains one of the artificial bases. Finally, there needed to be an enzyme that chemically linked an amino acid to this tRNA.

To start with, the team decided to work with one of the amino acids that the cell uses already. This ensured that it already had the enzymes needed to link an amino acid to the tRNA—all they needed to do was add a tRNA gene with an artificial base. Once they did, the cell would take care of the rest.

To make sure this worked, the team also engineered a gene that encodes a fluorescent protein so that it had an artificial base in the middle of it. In cells that haven't been engineered, there is no way to deal with this, so the gene can't be translated into protein and the cells won't glow. But add the engineered tRNA, and the cells can make the protein and glow green. So the system works.
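
The logic of that test can be captured in a toy model. The Python sketch below is an assumption-laden illustration, not the team's actual design: "X" is a placeholder for an artificial base, the codon "AXC" and its pairing with serine are made up for the example, and the codon table is a tiny subset of the real one.

```python
from typing import Dict, List, Optional

# Small subset of the standard codon table (mRNA codon -> amino acid).
NATURAL_TRNAS: Dict[str, str] = {
    "AUG": "Met", "GGC": "Gly", "UCU": "Ser", "AAA": "Lys",
}

def translate(mrna: str, trnas: Dict[str, str]) -> Optional[List[str]]:
    """Read the message three bases at a time.

    Returns None if any codon has no matching tRNA, which in this toy
    model means no full-length protein and therefore no glow.
    """
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        codon = mrna[i:i + 3]
        if codon not in trnas:
            return None
        protein.append(trnas[codon])
    return protein

# Hypothetical reporter message with a placeholder artificial base "X".
reporter = "AUGGGCAXCAAA"

# Hypothetical engineered tRNA that reads the new codon as serine.
engineered = {**NATURAL_TRNAS, "AXC": "Ser"}

print(translate(reporter, NATURAL_TRNAS))  # None
print(translate(reporter, engineered))     # ['Met', 'Gly', 'Ser', 'Lys']
```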

But it works in the sense that it simply mimics what biology does already. Part of the point of this work is to get biology to do something new.

The researchers next took advantage of a rare species that uses a 21st amino acid (N6-[(2-propynyloxy)carbonyl]-l-lysine, since you asked). It has the enzymes to attach that amino acid to a tRNA. So the researchers took the gene for that enzyme and the gene for the tRNA, and they modified the tRNA to include an artificial base. When engineered into bacteria, this combination led them to insert this strange amino acid into the fluorescent protein, again allowing the cells to glow green.

The key thing here is that, collectively, this system takes life where it has never been before (at least since everything on Earth shared a common ancestor). The bases being used had never been in a living organism prior to this work. And they're being used to encode an amino acid that's only used by a handful of species in a completely separate domain of life. In doing so, the researchers have taken a genetic code that has 64 possible states and expanded it out to one that has 216.
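
The jump from 64 to 216 is just counting, and it can be checked by brute force. In this short Python sketch, "X" and "Y" are placeholder letters standing in for the two artificial bases; they are not the chemicals' real names.

```python
from itertools import product

NATURAL = "ACGT"
EXPANDED = NATURAL + "XY"  # X and Y are placeholders for the new bases

natural_codons = {"".join(p) for p in product(NATURAL, repeat=3)}
expanded_codons = {"".join(p) for p in product(EXPANDED, repeat=3)}

# Codons that contain at least one artificial base.
new_codons = expanded_codons - natural_codons

print(len(natural_codons))   # 64
print(len(expanded_codons))  # 216
print(len(new_codons))       # 152
```

Every one of those 152 additional combinations contains at least one artificial base, so none of them collides with a meaning the cell already relies on.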

Implications

There are a lot of implications to this work, so it's worth spending time to go through them. To begin with, it really does imply that life's choice of chemistry has been limited by a historic accident. Yes, everything about life requires nucleic and amino acids. But there's apparently a great deal of flexibility when it comes to what those nucleic and amino acids look like. That makes sense, given that the enzymes that make proteins already have to deal with 20 very distinct amino acids.

This also means we have a great deal of flexibility if we want to get organisms to use different amino acids. Enzymes can do many remarkable things given their relatively simple chemistry, but there's definitely the potential to expand on that—enzymes with phosphorus or boron in their structure, or metals chemically locked into a catalytic site. It opens up a whole world of chemistry, one that could see enzymes move into areas where they haven't seen a lot of use yet.

The lure of updating life's chemistry has been so strong that there's a group out there that's systematically editing a bacterial genome in order to free up a single three-base code for use with an artificial amino acid. The technique shown here is far more powerful, in that it opens up more than 150 new three-base codes. Which means we could potentially use multiple artificial amino acids in a single protein.

Which brings us to the technique's limitations. Since these artificial nucleic and amino acids aren't part of life's repertoire, no living organism knows how to make them. Which means they have to be supplied, and there has to be a way of importing them into a cell. For the artificial amino acids, it means there has to be a way to specifically link them to a tRNA for use, which means modifying an existing enzyme until it works with something new. That's not going to be easy.

But the key thing about getting this to work in living organisms is that it opens up an incredibly powerful tool for use: evolution. If we can make an organism's survival dependent upon our own poor implementation of some artificial chemistry, then a few hundred generations—meaning less than a week—will likely leave us with a finely tuned system.
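
The "less than a week" figure is back-of-the-envelope arithmetic. Here is a rough Python sketch of it; the 30-minute doubling time is an assumption (fast-growing lab bacteria can divide about that quickly under good conditions), and the function name is invented for the example.

```python
def days_for_generations(generations: int, doubling_minutes: float) -> float:
    """Time needed for a number of back-to-back generations, in days."""
    minutes_per_day = 60 * 24
    return generations * doubling_minutes / minutes_per_day

# Assumed values: 300 generations, one division every 30 minutes.
print(days_for_generations(300, 30))  # 6.25
```

Slower-growing engineered strains would stretch that timeline, so treat the number as an order-of-magnitude estimate.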

Nature, 2017. DOI: 10.1038/nature24659