For most people, "protein" is just a line on a nutritional information label. But many of the medicines developed through modern biotechnology—human insulin, vaccines, cancer treatments, and more—are based entirely on proteins.

The protein listed on a nutrition label is generally a large, complex collection of hundreds or thousands of distinct molecules. The full set of proteins in chicken muscle and fat cells, for example, undoubtedly falls on the "thousands" end of the spectrum, and that's before considering things like splice variants and the common chemical modifications of proteins. By contrast, a medicine has to be a single protein, largely free of contaminants that can produce allergic reactions or other off-target effects. How do we get from thousands to one?

Since each protein is chemically distinct, isolating it takes advantage of its distinctive chemistry. Decades of biochemical research have left us with a number of ways to use that chemistry to separate out a protein from a mass of others. Biochemists, after all, need to isolate pure proteins to study their activities and structure—sometimes for fundamental research and sometimes while on the road to developing a medicine. Fortunately, their methods have scaled up for industrial production.

Supported by columns

It's relatively easy to get a mix of proteins out of cells; you simply blow up the cells. Detergents act to open holes in membranes, allowing the contents of cells to spill out. The debris of the cell—the membranes and DNA—can then be removed by spinning the resulting mixture in a centrifuge, which will draw the heavier debris to the bottom. (If you actually want to purify a membrane protein, which tends not to dissolve in water, then you typically need to re-dissolve the debris and perform most of the ensuing purification in other solvents.)

For some larger proteins or complexes of proteins, the centrifugation step itself can provide a partial purification. You can mix in a solution of a heavy substance, which will form a density gradient within the centrifuge tube (low at the top, high at the bottom). Since large proteins have a well-defined density, they'll settle at a specific height along the gradient. But many proteins we're interested in, like insulin, are too small for this to work.

After that, you go through a series of steps designed to make your protein stick to stuff while most other proteins don't. There will typically be annoying proteins that stick as well, so you often need to perform several of these sticking steps in series, each one removing more of the unwanted proteins.

These steps are performed in what are called "columns"—glass tubes filled with a material (typically beads or a gel). These let solutions flow through while holding up a subset of the proteins. There are typically two types of columns. In one, which I'll call a "sticky column," your protein sticks strongly and will stay there while most other proteins flow past. In the other, which I'll call a "slow column," proteins move at different speeds. You just have to figure out when yours comes out the other end.

For a slow column, the mixture of proteins starts on top of the column, and then the physical properties of the protein, its size and complexity, slow it down as solution flows through the column. For example, a gel filtration column is packed with beads made of a thick mesh of microscopic fibers that proteins have to maneuver through as the flow of liquid drives them down the column.

This approach is also known as size exclusion, because the mesh leaves the beads riddled with tiny pores of a defined size. Proteins that are small enough can diffuse into the pores, and thus travel slowly. Proteins above the size cutoff, by contrast, can't get into the pores and flow through quickly.
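To make the size cutoff concrete, here's a toy Python sketch of how a protein's size might set the point at which it leaves a size exclusion column. Every number in it (the column volumes, the 600 kDa cutoff, the 10 kDa lower limit) is an illustrative assumption rather than a property of any real column material, and the straight-line relationship between pore access and the logarithm of protein mass is a common rule of thumb, not an exact law.

```python
import math

def elution_volume_ml(mass_kda, void_ml=8.0, total_ml=24.0,
                      exclusion_kda=600.0, smallest_kda=10.0):
    """Estimate where a protein leaves a toy size exclusion column.

    All default values are made-up assumptions for illustration only.
    """
    if mass_kda >= exclusion_kda:
        # Too big for any pore: the protein only sees the liquid between beads.
        return void_ml
    if mass_kda <= smallest_kda:
        # Small enough to wander into every pore: the protein comes out last.
        return total_ml
    # In between, assume pore access falls linearly with log10(mass).
    span = math.log10(exclusion_kda) - math.log10(smallest_kda)
    accessible = (math.log10(exclusion_kda) - math.log10(mass_kda)) / span
    return void_ml + accessible * (total_ml - void_ml)

# Hypothetical proteins of different sizes, largest first.
for mass in (1000.0, 150.0, 50.0, 5.0):
    print(f"{mass:7.1f} kDa -> {elution_volume_ml(mass):5.2f} ml")
```

With these made-up settings, anything above the cutoff comes out together at the 8 ml mark, which is why a size exclusion column can't separate two proteins that are both too big for the pores.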

For slow columns to work, solvent is continuously added to the top of the column, causing a flow through to the bottom. Depending on how strongly your protein of interest is slowed down, it'll come out the bottom a set time later. This isn't especially precise, though, since that time reflects only an average speed; individual molecules run a little ahead of it or lag a little behind. If you plot the amount of protein exiting the column vs. time, you end up with a Gaussian (bell) curve. So, you collect the solution over a window of time that captures most of your protein.

Collecting Fractions: Fluid enters the top of the column at a constant rate. Vast numbers of proteins gradually make their way through the column with the liquid. But they don't all move at an identical speed; some end up going a little faster, some a bit slower. If you plotted the amount of protein coming out of the column over time, it's likely to trace a bell curve.

Unless your protein happens to be colored, there's no way to tell by looking when it's coming out of the column. So biologists typically collect what are called fractions: samples of a fixed volume from the liquid coming out of the column. So if you were collecting 1 ml fractions, you'd simply shift the column flow to a new tube each time a millilitre had been dispensed. Your protein will typically be distributed over several consecutive fractions. Afterwards, you can take a small sample from each fraction and test it to see if your proteins are in it. Once you find all the fractions that contain your protein, you can combine them and use them for further experiments.
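The fraction-collecting logic can be sketched in a few lines of Python. This is a simplified model, not a lab protocol: it assumes the protein leaves the column as a perfect bell curve, and the peak position (10 ml), the spread (1.5 ml), and the 5 percent threshold for keeping a tube are all hypothetical values chosen for illustration.

```python
import math

def gaussian_cdf(x, mean, sd):
    """Share of a bell curve that lies below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def fraction_contents(peak_ml, spread_ml, fraction_ml=1.0, n_fractions=20):
    """Share of the total protein that lands in each collection tube."""
    shares = []
    for i in range(n_fractions):
        start, end = i * fraction_ml, (i + 1) * fraction_ml
        shares.append(gaussian_cdf(end, peak_ml, spread_ml)
                      - gaussian_cdf(start, peak_ml, spread_ml))
    return shares

def fractions_to_pool(shares, min_share=0.05):
    """Indices of tubes holding at least min_share of the protein."""
    return [i for i, share in enumerate(shares) if share >= min_share]

# Assumed elution profile: peak at 10 ml with a 1.5 ml standard deviation.
shares = fraction_contents(peak_ml=10.0, spread_ml=1.5)
pool = fractions_to_pool(shares)
recovered = sum(shares[i] for i in pool)
print("tubes to combine:", pool)
print(f"protein recovered: {recovered:.1%}")
```

Under these assumptions, the protein is spread over six consecutive tubes, and pooling them recovers roughly 95 percent of it. Raising the threshold gives a more concentrated pool at the cost of leaving more protein behind.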

Sticky columns work somewhat differently. Here, your protein enters at the top and then sticks to the material in the column. You can essentially run infinite amounts of solvent through the column to get rid of any gunk that came with the protein, cleaning up things considerably. Once it's clean, you add a different solution to the top, one that disrupts the interaction between your protein and the sticky column. Your protein will then flow out of the bottom, again showing a bell-shaped pattern over time.

One option for a sticky column is based on charge. Most proteins contain amino acids that can gain or lose a hydrogen ion, and thus take on a charge, depending on the pH of the solution they're in. So by setting the pH of a solution carefully, you can ensure that your protein is positively or negatively charged. The protein mixture can then be run on an ion exchange column. Here, the column materials have a charged surface that's neutralized by the presence of ions (chloride or magnesium, for example). When a protein of the opposite charge comes along, it can displace the ions, sticking to the surface in their stead. In essence, you're exchanging ions for a charged protein.

The protein will stick there as others flow through, and it will remain stuck until you change the pH or flood the column with salt, at which point ions can be re-exchanged.
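Here's a rough Python sketch of how pH sets a protein's charge, and therefore which kind of charged surface it will stick to. It uses the Henderson-Hasselbalch equation with approximate textbook pKa values; real values shift with each protein's structure, and the model ignores some weakly ionizable amino acids. The sequence and the function names are hypothetical, made up for the example.

```python
# Approximate textbook pKa values (assumptions; they vary between proteins).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}  # lysine, arginine, histidine
NEGATIVE_PKA = {"D": 3.9, "E": 4.1}              # aspartate, glutamate
N_TERMINUS_PKA = 8.0
C_TERMINUS_PKA = 3.1

def net_charge(sequence, ph):
    """Estimate a protein's net charge at a given pH (cysteine and tyrosine ignored)."""
    # Groups that carry +1 while they hold on to a hydrogen ion (pH below pKa).
    positive = [N_TERMINUS_PKA] + [POSITIVE_PKA[aa] for aa in sequence if aa in POSITIVE_PKA]
    # Groups that carry -1 once they lose a hydrogen ion (pH above pKa).
    negative = [C_TERMINUS_PKA] + [NEGATIVE_PKA[aa] for aa in sequence if aa in NEGATIVE_PKA]
    charge = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in positive)
    charge -= sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in negative)
    return charge

def surface_that_binds(sequence, ph):
    """A protein sticks to a surface carrying the opposite charge."""
    return "negatively charged" if net_charge(sequence, ph) > 0 else "positively charged"

toy_protein = "MDEKHDELRGE"  # hypothetical sequence, single-letter amino acid codes
for ph in (3.0, 5.0, 7.0, 9.0):
    print(ph, round(net_charge(toy_protein, ph), 2), surface_that_binds(toy_protein, ph))
```

The same toy protein is positively charged in acid and negatively charged at neutral pH, which is why shifting the pH can both attach a protein to an ion exchange column and release it again.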

A similar attraction is used for reverse phase chromatography, except here the attraction is hydrophobic (water-repelling) rather than charge-based. High salt causes hydrophobic areas on the protein to stick to the hydrophobic surface of the column. The protein remains stuck until a more hydrophobic solvent, usually some mixture of water and an alcohol, is sent through the column.

In some cases, it's possible to get very specific interactions. For example, if two proteins are known to interact, and you can stick one to the surface of a column, you can get the second to stick in the column until you change the pH and add salt. This approach is called affinity chromatography, and it doesn't even require two proteins to interact. It's possible to fish out DNA-binding proteins by coating a column with the DNA sequence they recognize.

A specific case involves the use of antibodies. If you can make an antibody to your protein, you can simply coat the column surface with the antibody, and your protein will get stuck. It's also possible to genetically engineer the gene encoding your protein to add a "tag," a short stretch of amino acids that an antibody recognizes. This lets you purify the protein even if you don't have an antibody to it. (This is more important than it sounds, since you typically need to purify a protein to make antibodies, creating a catch-22.) If the tag doesn't alter the protein, it can be left in place. Alternatively, you can modify the gene to add a sequence that an enzyme recognizes and cuts, allowing the tag to be removed after purification.

A final approach involves metal affinity. Some amino acids have an affinity for certain metals. If your protein happens to have some of these amino acids on its surface, it'll stick to a column coated with the appropriate metal. One of the most widely used interactions is between the amino acid histidine and the metal nickel. In the same way that you can engineer a gene to contain a tag, it's possible to add a stretch of six histidines to the end of a protein. Your protein will then stick until another chemical that likes nickel (commonly imidazole) gets added to the column.

As mentioned above, unless you have a specific affinity worked out, you'll usually get a number of additional proteins coming out of the column at the same time as your protein of interest. As a result, several of these techniques are often used in series in order to get a protein pure enough to study.
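A bit of arithmetic shows why running columns in series pays off. The Python sketch below tracks a protein of interest and its contaminants through successive columns; the starting amounts and the per-column figures (keep 80 percent of your protein, let 5 percent of everything else through) are invented for illustration and will differ for any real purification.

```python
def run_steps(target_mg, contaminant_mg, steps):
    """Track amount and purity after each column.

    steps is a list of (target_kept, contaminant_kept) fractions, one per column.
    Returns a list of (target_mg, purity) pairs, one per column.
    """
    history = []
    for target_kept, contaminant_kept in steps:
        target_mg *= target_kept
        contaminant_mg *= contaminant_kept
        purity = target_mg / (target_mg + contaminant_mg)
        history.append((target_mg, purity))
    return history

# Hypothetical starting mix: 1 mg of your protein among 1,000 mg of others.
# Assume each column keeps 80% of your protein and 5% of the contaminants.
history = run_steps(1.0, 1000.0, [(0.8, 0.05)] * 3)
for step, (amount, purity) in enumerate(history, start=1):
    print(f"column {step}: {amount:.3f} mg of target, {purity:.1%} pure")
```

In this toy run, purity climbs from under 2 percent after one column to about 80 percent after three, while roughly half of the starting protein is lost along the way. That trade-off between purity and yield is the reason each extra step has to earn its place.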

The remarkable thing, however, is that these techniques scale up. While researchers may purify milligrams of protein on columns a few inches long, it's possible to make enormous columns for producing proteins on industrial scales. All the same principles apply, and the same procedure used in the lab can often be used in a production facility.