Introduction

My main research focuses on using DNA molecules to study polymer physics. Theoretical polymer physics is based on the thermodynamic behaviour of microscopic chains, but experiments examining single chain behaviour wasn’t possible until improvements in fluorescence microscopy in the 1990s allowed us to visualize the dynamics of DNA molecule in real time. DNA makes a good model system to study polymer physics because it is big enough that we can observe single-molecule dynamics in a microscope (unlike synthetic polymers like polyethylene), but small enough that thermal fluctuations are still the main driver of physics (unlike macroscopic polymers like spaghetti). I have written a few articles, on PhysicsForums and on my blog, about some of the interesting physics problems that arise is in this field. When I tell people what I do, many ask me what the practical applications are. Although I am primarily motivated by figuring out the physics and less by the actual applications, it is a fair question, and there’s a reason people are paying me to do this. This article will talk about a few genetic sequencing applications related to my area of research, and some of the physics problems associated with them.

Nanochannel Genomic Mapping: Reading DNA squished into tubes.

DNA, as a polymer, takes on a spatially disorganized conformation. A molecule floating around in the solution will take the form of a self-avoiding random walk, and in a cell the DNA is packaged in an extremely complex manner that we’re just beginning to understand, in a structure that some call a fractal globule. Common to both these conformations is the fact that there is not a direct correlation between spatial position and genetic position: you pick a spot along the DNA, then move a micron over in space, and you’ve gone thousands of base-pairs along the molecule.

However, if you stretch a DNA molecule out so that it adopts a linear conformation, now there is a very strong correlation between spatial position and genetic position: you move forward in space, you move along the molecule. There are a few ways to stretch out DNA: you can comb it on a surface, stretch it with optical tweezers, or attach a magnetic bead to it and pull it a bigger magnet. The method I’m going to talk about, however, involves stretching out molecules in a small tube, also known as a nanochannel. Nanochannels are typically on the order of 100 nanometers wide, and are etched into glass using similar technology used to make silicon computer chips.

Normally a molecule will fluctuate randomly in three dimensions and take on a high-entropy shape called a random coil, but if it can’t do that because of the walls confining it, it has no choice but to extend along the tube, such that its equilibrium conformation is a stretched one. This may seem obvious, but even fifteen years ago it was not: people had to deduce experimentally that a DNA molecule that is twice as long will extend twice as far in a tube. I wrote about self-avoiding walks in a previous article, and showed that in d dimensions, a self-avoiding walk increases in size with respect to length with an exponent of 3/(D+2), which is 0.6 is 3D, 0.75 in 2D, and 1 in 1D (these are called Flory statistics). It was found in 2005 that DNA confined in nanochannels will extend twice as far if the molecule is twice as long, meaning that 1D Flory statistics apply, even though it’s a 3D system pretending to be 1D.

What kind of information can you get from looking at a stretched out molecule? You certainly cannot get sequence information, because the resolution is not nearly good enough. You can however, look at large-scale structure, which current sequencing techniques cannot access, because they have to stitch together tiny strands of DNA. For example, some cancers are marked by segments of one chromosome being transferred to another.

Antibiotic strains of bacteria might have different segments of their genome rearranged. These are effects which you could see with a coarse-grained map but not a fine-grained disorganized sequence. The way to get genetic information from these stretched molecules typically involves looking at the positions of fluorescent labels. For example, you can attach fluorescent dyes to certain base-pair sequences, and look at how far apart the different dye molecules are under microscope. There might be, for example, one strain of bacteria where you expect these fluorescent tags to be 5 microns apart, and another where you expect them to be 15 apart, and you could distinguish them by stretching out the molecules in a nanochannel. Current sequencing technologies have base-pair resolution, but require many small fragments of DNA to be pieced together after sequencing, which loses information about large-scale structure and cell-to-cell variation.

When a molecule is confined to a tube, it will extend farther down the tube if the tube is smaller, because it has less room to fluctuate in the transverse directions. But what is the relationship between the diameter of the tube, and how far the molecule extends? That turns out to be a very complicated question that people have spent over a decade trying to solve. If the tube is very small, you can treat it like a chain of rigid rods deflecting back and forth between the walls (this is called the Odijk regime), and the molecule extends almost completely, with a shortening corresponding to the cube-root of the channel diameter. If the tube is comparatively wide, you can treat the molecule as a chain of connected “blobs,” and using some of the theory I discussed in a previous article (the deGennes regime), you can show that the extension decreases with the two-thirds power of the channel diameter. However, in every experiment, the nanochannels are too big to put the molecule in the Odijk regime and too small for the deGennes regime, so the tricky part is describing the interpolation between the two regimes, when the diameter of the channels is roughly the same as the rigidity length-scale of DNA (50 nm). There have been a huge number of papers in the last five years just trying to more accurately measure and model this relationship, deducing the existence of intermediary regimes characterized by different amounts of fluctuation in the extension of the molecule.

Currently, there is one company that actually does this: BioNanoGenomics uses 45 nm nanochannels etched in silicon to produce large scale genomic maps using sequence-specific fluorescent tags, and has been involved in a few of the papers trying to understand nanochannel polymer physics. A lot of their business right now I believe involves genetic analysis in the agriculture industry.

Nanopore Sequencing: Reading DNA by threading it through a hole.

A nanopore is basically a really small hole, typically between 1 and 20 nanometers in diameter, drilled through a membrane typically about 30 nanometers thick, often in silicon nitride. The membrane sits in a tube such that all the fluid flows through the narrow pore, and the tube is filled with salty water, and a voltage is applied across the pore. The current of ions driven through the pore by the electric field can be measured, which depends on the concentration of salt, the strength of the electric field, and the diameter of the pore. However, if there is a DNA molecule in the solution, as it translocates through the pore, it can physically block the flow of ions, leading to an observed reduction in the overall current. If this reduction in the current can be read with high enough resolution (both in time and current), the blockages due to the individual bases can be pieced together to construct the sequence of the molecule.

Nanopore sequencing is extremely difficult technically. There are many physics and engineering problems, not just involving DNA polymer physics, that must be solved. Nanopore membranes are typically so sensitive that they break if you breathe the wrong way around them and the signal-to-noise ratio is typically very low; the experiments typically take place in multi-layer vibrationally insulated Faraday cages. There are a number of physical challenges associated with understanding nanopore translocation. A polymer typically adopts a high-entropy configuration called a random coil (what it sounds like), and getting one end to go through a really small hole involves a massive local reduction in entropy. Overcoming this loss typically requires a huge electric field, which leads to a statistical “capture” process before one end of the molecule gets driven through the pore, and a distribution in the time it takes for a molecule to translocate. Because the electric fields have to be so strong, once the molecule starts translocating it typically takes milliseconds to zoom through the pore, making it very difficult to read the current over such a short time. You can strengthen the signal by upping the voltage across the pore, but this increases the translocation speed and makes the results harder to analyze. A lot of research on both the science and engineering side has gone into thinking of clever ways to slow down the molecule (for example by tying it in knots), and to improve the signal quality of the pore (for example by making it as narrow and thin as possible, in a layer of graphene).

There is a company called Oxford Nanopore that sells a DNA sequencing USB stick. They solved the speed problem by harnessing pre-existing biological architecture: a helicase enzyme sits atop a biological pore (one that would normally be used to transport molecules into cells), and it uses ATP to open up the DNA one base at a time and ratchet one strand through the pore. This is still a very noisy measurement, and the device accounts for this by running hundreds of pores in parallel. Although I was skeptical at first, these things actually exist, and apparently actually work.

So, while my research is about understanding the statistical mechanics of polymers and answering the interesting questions that arise from that, there are a few useful things that can be done by squishing molecules into tubes or through holes.