DNA Structure and Topology

DNA GEOMETRY

DNA is a double-stranded macromolecule in which each strand is a polymer of deoxyribonucleotide monomers. The monomers are linked by "phosphodiester bonds" between the 3' carbon of one deoxyribose molecule and the 5' carbon of another. Since an ester is a condensation product of an alcohol (sugars are "polyalcohols") and an acid (here phosphoric acid, H3PO4), the term "diester" signifies the formation of two ester linkages by each phosphate group.

The following is a dimer of the deoxyribonucleotide monomers adenine (A) and guanine (G), drawn such that A is to the left of and above G. According to the naming convention, we would call this: Deoxyadenylyl-3',5'-guanylyl-3'-phosphate and we would use a shorthand notation, d(AGp), to represent the structure. Notice that a phosphate group acts as a bridge between the two deoxyribose sugars, and that the order, from left to right (also from top to bottom here) is from the 3' carbon of the first sugar to the 5' carbon of the second. The two bonds thus formed are ester bonds. The second sugar has a phosphate hanging from its 3' carbon, and this is indicated by the end of the name, "-3'-phosphate". The "p" that follows AG in the notation d(AGp) represents this feature. Also, notice the negative charges on the oxygen atoms, as shown, in the phosphate groups. At physiologic pH, the hydrogens on these oxygens are ionized.

By convention, these structures are drawn with the 5' end of the molecule to the left. It is designated "5'-" because the 5' carbon of the first sugar is unesterified. It is the first piece of the molecule that is seen upon "reading" from left to right. If we were to replace the hydrogens on the 2' carbons of the sugars with -OH groups, then we would have a dimer of ribonucleotides, in this case AGp.

Exercise: Draw the CG dinucleotide as d(CpG).
This dinucleotide is present in the vertebrate genome at only about 20% of what would be expected, statistically, except in upstream regions of many genes. These regions are known as "CG islands" and it is in these regions that gene expression can be modulated by methylation of cytosines in the 5th position. DNA probably is in a Z-conformation in these islands. So much for the naming/drawing conventions. What's really important to recognize, however, is that the conventions specify "direction" and this concept will appear over and over again as we study DNA. The dimer above has a direction indicated by the notation 5'--> 3'. A DNA molecule is composed of two strands of deoxyribonucleotide polymers, in a very special geometric relationship in which one is entwined about the other such that an overall helical shape results. This is the familiar "double helix", described by Watson and Crick, in which the two helices share a common axis, and both are wound in a right-handed manner. A "right-hand" rule is a mnemonic that will allow you to always visualize this directionality correctly. Make a fist with your right hand, with the thumb pointing upward. As the helix rises in the direction of the thumb, the fingers curl in the direction of the turn. Each nucleotide base of one strand is paired with a nucleotide base on the other strand to create a stable structure of the two polymers. Early on, Erwin Chargaff recognized that, in DNA molecules, the number of A molecules equaled the number of T molecules, and that the number of G molecules equaled that of C molecules. A similar relationship did not hold for RNA, however. This was before the discovery of the double-helical structure of DNA, and it is explained by the further demonstration that the pairing of bases is not random, but rather follows the rule that an A pairs with a T and a G pairs with a C. These relationships are known as "complementarity rules". 
The forces between the complementary bases are hydrogen bonding, van der Waals forces and the hydrophobic force. We will say more about these later. It turns out that it takes a stretch of 10.5 paired nucleotide bases to make a complete turn of the helix. When we look at DNA topology below, we will approximate this as 10 bases per turn to make the mathematics simpler. Looking at the dimer drawn above, then, we can pair it to its complementary dimer, d(CTp), where we have written it in the 5'-3' direction, but we cannot appreciate the double-helical nature of DNA until we have 10-11 bases on one strand paired with their complements on the other. (Note that this discussion has centered on DNA. RNA is not a double-stranded helical molecule. It is usually single-stranded, although small stretches of it may be doubly wound if the right relationship between contiguous bases in the "linear" strand exists. RNA does not, therefore, display "Chargaff's Rules".) It should be readily apparent that the direction of the strand complementary to the 5'-3' strand is 3'-5'.

The double helical structure of DNA can be described as analogous to a helical staircase, with the two chains of sugar-phosphate bonds representing the rails. Since the bases are attached to the sugars, and each base is paired to its complementary base, the edges of the bases are exposed to solvent within two grooves along the helix, the "major groove" and the "minor groove". It is within these grooves that DNA interacts with solvent molecules, ions, proteins and other molecules. Structural variation of these grooves is one mechanism by which the reactivity of DNA is modulated. There are three major structural variations that we will come across, "A", "B" and "Z" DNA, and they differ in the relationship between the bases and the helical axis:

B-DNA: This is the structure of fully hydrated DNA, and is the most common form encountered in vivo.
There is about a 6° tilt of the bases to the helix axis and the axis goes through the center of the base pairs. The diameter of B-DNA is 20 Å, and there are about 10 bases per helical turn (36° of twist along the helix for each base pair). The major groove is wide, while the minor groove is narrow. Owing to the location of the helical axis in the center of the base pairs, the edges of the base pairs are about equally deep in the interior.

Exercise: Study B-DNA structure

A-DNA: When B-DNA is dehydrated, there is a reversible structural change to A-DNA, in which there is an increase in the tilt of the bases to about 20° with respect to the helical axis, which does not pass through the base pairs at all, but rather is shifted into the major groove. The result is a structure with a narrow and deep major groove and a wide and shallow minor groove, a diameter of about 26 Å, and 11.6 bases per helical turn. When looking at A-DNA end-on, there is a 6-Å diameter hole at the helical axis.

Z-DNA: Unlike B-DNA and A-DNA, Z-DNA is a left-handed helix. It has a base tilt of about 7° and a diameter of about 18 Å. There are 12 bases per helical turn, a narrow and deep minor groove and a flat major groove. Sequences of DNA consisting of alternating purine and pyrimidine bases have been shown to adopt this conformation. Such sequences include poly d(GC), poly d(AC), poly d(GT) etc. in varying combinations, and the Z form can arise at high salt concentrations, under dehydrating conditions in the presence of Mn2+ or Co2+, or in "supercoiled" DNA. The conformational change from B-DNA to Z-DNA is one mechanism for relief of the torsional strain found in B-DNA in vivo, and may serve as a switch mechanism to regulate gene expression.
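The helical parameters quoted above for the A, B and Z forms can be collected in a small table and used to derive the twist per base pair (360° divided by the number of bases per turn). The following is only a sketch: the class name HelixForm, the dictionary FORMS, and the convention of giving a left-handed helix a negative twist are assumptions made for illustration, and the numbers are simply the approximate values stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelixForm:
    """Approximate helical parameters for one DNA conformation (values from the text)."""
    name: str
    bp_per_turn: float        # bases per complete helical turn
    diameter_angstrom: float  # helix diameter in angstroms
    base_tilt_deg: float      # tilt of the bases relative to the helix axis
    right_handed: bool        # False for the left-handed Z form

    def twist_per_bp_deg(self) -> float:
        """Rotation about the helix axis per base pair, in degrees.

        Assumed sign convention: positive for right-handed, negative for left-handed.
        """
        magnitude = 360.0 / self.bp_per_turn
        return magnitude if self.right_handed else -magnitude


# Hypothetical lookup table keyed by form name.
FORMS = {
    "A": HelixForm("A-DNA", 11.6, 26.0, 20.0, True),
    "B": HelixForm("B-DNA", 10.0, 20.0, 6.0, True),
    "Z": HelixForm("Z-DNA", 12.0, 18.0, 7.0, False),
}

if __name__ == "__main__":
    for form in FORMS.values():
        print(f"{form.name}: {form.twist_per_bp_deg():.1f} degrees per base pair")
```

For B-DNA this reproduces the 36° per base pair given above; the A and Z values follow from the same division.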
DNA TOPOLOGY

DNA in its relaxed (ideal) state usually assumes the B configuration, a right-handed helix 20 Å in diameter in which the nucleotide base planes are nearly perpendicular to the helix axis, with a vertical distance of 3.4 Å between them and with 10 base pairs per helix turn, giving a "pitch" of 34 Å. Linear DNA in solution assumes this configuration because it is the one of minimum energy. Any deviation from this relaxed state increases the energy of the DNA molecule. A piece of circular DNA with a diameter that is large when compared to the double helix thickness will have an energy only slightly greater than that of the same length of DNA in its linear form, because the curvature of the molecule is small.

The helix axis of DNA in vivo is usually curved, rather than linear. This is a necessary condition, as the stretched length of the human genome is about 1 meter and this length needs to be "packaged" in order to fit in the nucleus of a cell. In eukaryotes, nature solved this problem by complexing linear DNA to histones (protein) to form nucleosomes. In prokaryotes, the entire genome is typically a circular DNA molecule and this, in turn, exists in further compact form in which the helical axis does not lie in a plane. This packaging of DNA deforms it physically, thereby increasing its energy. Such an increase in stored (potential) energy within the molecule is then available to drive reactions such as the unwinding events that occur during DNA replication and transcription.

Too much stored energy is not necessarily a good thing, though. Think about the consequences of disturbing a typical explosive with tremendous strain energy stored in its rings. In nature, this problem is addressed by having DNA form supercoils, in which the helical axis of the DNA curves itself into a coil.
The supercoil or superhelix structure is one of nature's solutions to the problem of minimizing the excess energy that builds up when DNA molecules are deformed during the process of storage. At the same time, whatever energy is stored gets put to good use, as mentioned previously.

We will look at the supercoiling of closed circular DNA that occurs in prokaryotes, and develop a physical and mathematical model that describes this phenomenon. This will serve as an introduction to chemical topology and will provide insight into the workings of a class of enzymes known as "topoisomerases", which control DNA supercoiling.

We already know that the linear form of DNA is the one of minimum energy and, if we can't have linear DNA, then the next best thing would be to have a closed circle of DNA with a "large" diameter. If you take a length of double-stranded DNA and tie the ends of each strand together, you have circular DNA. In doing this, you have curved the axis of the double helix and, if you are careful, you can form a circle (the helical axis lies in a plane).

(1) Take 2 pieces of electrical wire, each with a different color, and wrap one piece around the other to approximate a double helix. In this construct, each wire represents a sugar-phosphate backbone of a DNA molecule. Connect the like-colored ends and place the construct flat on the table. The axis around which the wires wind is now circular in shape, and parallel to the tabletop.

This configuration, with the axis forming a circle in a plane, is the loop of minimum energy. You can deform the circle to any other kind of loop in the plane, but the resulting loop of DNA will have a slightly higher energy. If you further deform the loop so that it is no longer confined to the plane, the energy increases. If you fold part of the loop over itself one or more times, the energy will be higher still.
Topology is the branch of mathematics that studies properties of objects that are invariant to continuous deformations of the object. In chemistry we often talk about various kinds of isomers, molecules with the same formula but with different configurations. Sometimes the connectivity of the atoms is different, sometimes the molecules are non-superimposable mirror images, and so forth. In a topoisomer, the atom connectivity (the set of covalent bonds) is identical but some other topological property is altered. In the loop of double-stranded DNA, each strand maintains its identity in terms of atom connectedness, no matter how you deform it. Furthermore, the number of times that one strand wraps around the other also does not change during deformation. This property is called the "linking number", Lk, and it is a topological invariant. If you want to change the linking number, you must break a bond in one of the two strands of DNA. Without that, the two strands are permanently linked to each other by a "topological bond". The topological bond is not a covalent bond, but some covalent bond must be broken in order to change the linking number. Topoisomerases are enzymes that can change the linking number of circularly wound double-stranded DNA.

(2) It might be difficult to look at a piece of circular DNA and determine the linking number. We can do a topological experiment with the 2 twisted wires that we formed into a loop to get more insight into the concept. Since the linking number is a topological constant, no matter how we deform the object, provided it is a continuous deformation (no "bonds" are broken), the linking number remains the same. Pull one of the wires such that it is stretched into a circle along the plane of the tabletop. The other wire is twisted around it. If you stretch an imaginary membrane or drumhead over the wire circle, you can ask yourself how many times the other wire pierces the drumhead from above (or below).
This number is the linking number.

To get a physical understanding of how a topoisomerase changes the linking number, you can do another experiment with the wires:

(3) Carefully detach the ends of one of the wires, leaving the other wire connected to itself. Swing one end of the loose wire a full turn around the connected wire (in either direction) and reattach the ends. Now determine the new linking number. It should have changed by ±1.

If this were DNA, and the first structure that you made with the wires represented the relaxed state, then the new structure with the new linking number is at a higher energy (less stable). To partially relieve the strain introduced by the change in linking number, the DNA must distort in other ways. There are, physically, two ways that the DNA can do this: by "twisting" and by "writhing". Twisting and writhing turn out to be geometric, rather than topologic, properties; that is, they do change when the molecule is deformed. It is the combination of twists and writhes that imparts the supercoiling, and these occur in response to a change in the linking number.

Remember, it is the topoisomerases that can change the linking number (some directly and some indirectly; see below), and it is the change in the linking number, ΔLk, that turns out to be the measure of the supercoiling. If you compare ΔLk to the Lk in the relaxed state (both of which must be integers), you get a ratio:

ΔLk / Lk = σ = the superhelical density.

A σ of -0.1 means that 10% of the helical turns in a sample of DNA (in its B configuration) have been removed. This underwinding results in negative supercoiling. In a cell, σ is usually about -0.05 to -0.07 (5 to 7% underwound).

Supercoiling turns out to be a very common phenomenon in DNA. The polyoma viruses and human papilloma viruses contain supercoiled DNA, and mitochondrial DNA is also supercoiled. The majority of small genomes, including genetic factors for fertility and drug resistance, are supercoiled.
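The ratio of the change in linking number to the relaxed linking number (the superhelical density) is simple enough to compute directly. The sketch below is illustrative only: the function names and the 4200-base-pair plasmid are hypothetical, and the relaxed linking number is estimated with the 10 bases per turn approximation adopted earlier for the topology discussion.

```python
def relaxed_linking_number(n_base_pairs: int, bp_per_turn: float = 10.0) -> float:
    """Estimate the linking number of relaxed closed circular B-DNA.

    Uses the approximation of 10 base pairs per helical turn from the text.
    """
    return n_base_pairs / bp_per_turn


def superhelical_density(delta_lk: float, lk_relaxed: float) -> float:
    """Return the superhelical density: change in Lk divided by the relaxed Lk."""
    if lk_relaxed == 0:
        raise ValueError("relaxed linking number must be non-zero")
    return delta_lk / lk_relaxed


if __name__ == "__main__":
    # Hypothetical plasmid of 4200 base pairs with 25 helical turns removed.
    lk0 = relaxed_linking_number(4200)
    sigma = superhelical_density(-25, lk0)
    print(f"relaxed Lk = {lk0:.0f}, superhelical density = {sigma:.3f}")
```

A negative result corresponds to underwound (negatively supercoiled) DNA; removing 25 of 420 turns gives a density of about -0.06, which is in the range quoted for cells.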
In order for a vector (like a bacterial plasmid) to be integrated into a larger piece of DNA, it must be supercoiled. Natural DNA circles are "underwound"; their linking numbers are less than those of their corresponding relaxed circles. They compensate by twisting and writhing (i.e., supercoiling). The supercoiling is not random; it is highly regulated by the cell and, in particular, each type of cell exhibits its own characteristic supercoiling.

If you start with a circular piece of double-stranded DNA and deform it, without making any cuts in the strands, the linking number remains constant, while the twist and writhe change. The mathematical relationship between these quantities is:

Lk = T + W

and what is really interesting about this relationship is that a topological property is equal to the sum of two geometric properties. Unfortunately, twist and writhe are not at all intuitive concepts. In the special cases in which the axis of the double helix remains in a plane or on the surface of a sphere, twist equals the linking number and there is no writhe, but all other cases are considerably more complex. In order to get a better feel for these geometric properties, it is helpful to look more closely at our model for DNA and at the mathematics of supercoiling.

A simple model for DNA is that of a ribbon, where each edge of the ribbon represents a sugar-phosphate backbone, and with one edge having a "direction" opposite to the other. The center of the ribbon is its symmetry axis, though, and when you twist the ribbon to approximate a linear piece of double-stranded DNA, the symmetry axis winds around the axis of the double helix.

(4) Take a strip of paper, about 6" to 12" long, and draw a line (the axis) down the center of it. Now, wrap the paper around a dowel rod (a pencil will do). Notice how the axis curves in a helical manner.
The mathematics would be a lot simpler if the symmetry axis coincided with the double helix axis, and the ribbon representation of the DNA can be modified as follows to achieve this. If you look at a diagram of DNA, you will see that there is an axis, in the plane of each base pair, around which the bases can be rotated 180 degrees such that the bases of the pair switch places. Since the two bases are not identical to each other, this is not a true symmetry axis, so it is called a "pseudodyad" axis. There is also a similar axis between each successive nucleotide pair. In the new ribbon model of DNA, a line drawn down the center of the ribbon represents the axis of the double helix and, when the ribbon is twisted around this line, the pseudodyad axes are always perpendicular to the surface of the ribbon at the double-helix axis of the ribbon. If you think about another line that is perpendicular to the plane formed by these 2 lines and that intersects the plane at the crossing of these two lines, then you can imagine a base situated at each end of this new line. The edges of the ribbon are assigned opposite directions.

(5) Take a wide rubber band that has been cut and draw a line longitudinally down the center of the strip. Twist the ribbon around the line, gently pulling each end as you twist. The line is the helix axis. Pick any point on the line and insert a straight pin perpendicular to it. The pin represents the pseudodyad axis.

That is the model for DNA that is typically used. Although the edges do not coincide with the sugar-phosphate backbones, when we manipulate this model the edges will represent the DNA backbones. If you were now to connect the two ends of the rubber band together, you would have a circular piece of DNA in which there were no helices. In this case, the edges, which represent the backbone of each strand, are not linked, as there are no covalent bonds between them (only H-bond interactions between the bases).
If you twist the ribbon a few complete revolutions and then attach the ends, you have closed circular helical DNA. Now the edges are "linked".

(6) Take the strip of paper and twist it one complete turn. Attach the two ends with a piece of tape. Take a pair of scissors and cut carefully along the center line that you drew, and pull gently.

The two loops are linked, forming a chain of two. Since each loop represents the covalent backbone of a DNA strand, the two strands are linked by a "topological bond". You cannot separate the strands unless you cut one of them (that is, you have to break a covalent bond). No matter how you deform these two curves, the linking number remains "1".

Since the edges were given orientations, we have to take this into account when we determine how two curves are linked. If you had made a clockwise turn in the ribbon before joining the ends and cutting, the topological relationship would be opposite to that which would occur had you first twisted it counterclockwise. (The relationship between the two can also be seen by putting one of them in front of a mirror and looking at its reflection.)

Given two curves linked in three-dimensional space, the following procedure will give you the correct linking number:

Project the curves onto a two-dimensional surface, being careful to note which curve lies over the other whenever the two curves cross.
At each such crossing, assign either a +1 or a -1 depending on which way the top piece must be rotated to coincide with the bottom. If clockwise, assign +1; if counterclockwise, assign -1.
Add up all of the crossing numbers and divide by 2.

(7) Determine the Lk for each of the projections on the handout.

Linking number as a topological constant should now be perfectly clear. This is important if you are to understand the physical and mathematical consequences of DNA being underwound in the cell (at least for part of the time).
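The projection procedure described above reduces to a short calculation once each crossing has been scored by hand. The following is a minimal sketch, not a general knot-theory tool: the function name linking_number is hypothetical, and it assumes you supply one entry (+1 or -1) for every place where one curve crosses the other in the projection; crossings of a curve with itself are not counted.

```python
from typing import Iterable


def linking_number(crossing_signs: Iterable[int]) -> int:
    """Linking number of two closed curves from their signed crossings.

    crossing_signs holds +1 (clockwise rule from the text) or -1
    (counterclockwise) for each crossing between the two different curves.
    """
    signs = list(crossing_signs)
    if any(s not in (1, -1) for s in signs):
        raise ValueError("each crossing must be scored +1 or -1")
    # Two closed curves always cross each other an even number of times
    # in a projection, so the signed sum is always divisible by 2.
    if len(signs) % 2 != 0:
        raise ValueError("expected an even number of crossings between two closed curves")
    return sum(signs) // 2


if __name__ == "__main__":
    # The paper strip of exercise (6): one full twist gives two crossings of the same sign.
    print(linking_number([+1, +1]))
    # Two curves that overlap in projection but are not linked: the signs cancel.
    print(linking_number([+1, -1]))
```

The division by 2 reflects the fact that each full wrap of one curve around the other produces two crossings in the projection.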
To summarize thus far:

(1) Relaxed DNA is in the "B" configuration, with ~10 base pairs per helical turn.
(2) The following structures are consistent with the relaxed state:
    (a) Linear DNA (either straight or curved)
    (b) Closed circular DNA, provided its axis lies in a plane or on the surface of a sphere.
(3) In any other naturally found geometry, the DNA is either under- or overwound. Its helical axis does not lie in a plane or on the surface of a sphere because of writhing and twisting. This is the physical solution to the potential (torsional) energy minimization problem.
(4) A linking number is (+) for right-hand turns and (-) for left-hand turns.
(5) Linking numbers are invariant to continuous deformation.
(6) Linking numbers can be changed by discontinuous deformation: cutting.
(7) If the linking number is changed from its relaxed value, the amount of change, ΔLk, is the measure of supercoiling.
(8) DNA in its natural state is underwound by about 5 to 7%.

At this point, it's a good idea to mention that supercoiling is not necessarily the only solution to the problem of normalizing the number of base pairs per helical turn in an underwound piece of DNA. You could also separate the two strands by breaking the hydrogen bonds between complementary bases in contiguous base pairs until the remaining DNA has the correct number of base pairs per turn. In terms of energy, though, it takes a lot more energy to break the H-bonds than to supercoil. Nevertheless, strand separation does occur during replication and transcription, and it turns out that it is the physics of the underwinding that facilitates the strand separation. Cruciform structures also require some unpairing of the base pairs and, again, it is the underwinding that maintains the required strand separation.

How, then, are linking numbers of DNA changed in the cell?
Two classes of enzymes are involved in changing a linking number:

(1) Type I topoisomerases: They break one strand, rotate the end of the broken strand around the intact strand, and then seal the ends. ("Nicking-closing" enzymes)
(2) Type II topoisomerases: They break both strands, rotate both ends by 360 degrees and then reconnect the respective ends. In this case, ΔLk = ±2.

In E. coli, type IA includes Topoisomerase I and Topoisomerase III, and they generally relax DNA by removing negative supercoils, increasing the Lk in units of +1. DNA gyrase is an example of a type II topoisomerase; it can introduce negative supercoils by decreasing the linking number in units of 2.

In eukaryotic cells, there are two type I topoisomerases (IA and IB) and one Topoisomerase II. Both types can relax both positive and negative supercoils, but neither can introduce negative supercoils (neither can underwind DNA).

To understand how topoisomerases work, it is necessary to look more closely at how the linking number is related to twisting and writhing. We already stated that Lk = T + W, and that T and W are geometric, structural properties whose values change during deformation.

When you turned the strip of paper 360 degrees before taping together the ends, you imparted a twist to it. Twist has something to do with spatial relationships between neighboring base pairs. The next exercise will give you a more intuitive feel for the concept of twist.

(8) Take the wide rubber band with the line drawn down its center and draw a series of arrows perpendicular to the line down its entire length, all pointing to one edge only. Now twist the ends 360 degrees in a right-handed direction by holding each end, twisting clockwise, and pulling gently. (By convention, such a twist has a value of +1.) The arrows swing around the helical axis, and their 2-dimensional projection "shrinks" down to nothing at the first node, only to reappear at the second node and steadily increase again.
The rate of rotation of the arrow with respect to the length of the helix axis is the twist. If you twist the ends by 720 degrees, you will see that the arrow rotates around the axis twice for the same length. Twist is altered by deformation and twist is a local phenomenon. The total twist is the sum of all of the local twists. The trivial case of twist = 0 results from connecting the ends of the rubber band without imparting any torsional strain (by twisting) to the band. Here, you can see that the arrows remain up (or down) along the circumference and don't change in length. But, in this example, the Lk is zero, also.

(9) Twist the ends of the band a full 360 degrees clockwise, and notice the twist. Attach the ends (you can staple them together or just hold them together with your fingers). Let the rubber band relax on the tabletop. Notice that it will relax into a figure of eight, and that all of the arrows are pointing in roughly the same direction. What can you say about the total twist in this case? What is the linking number?

Notice that, if the helical axis is constrained to lie in a plane, the twist, T, is always equal to the linking number, Lk.

Twist can be geometrically quantified. Let the angle that the helical turn makes with the horizontal be "α". If there are "N" turns, each with the same inclination angle, then the total twist is N sin α. This example will come in handy when we look at writhe, "W".

Writhe is a measure of the coiling of a superhelix, and it is a geometric, structural property like twist and subject to changes as deformation occurs. If you know both Lk and T, then W equals Lk - T. When the helical axis lies in a plane, as in linear or curved DNA or in closed circular DNA, or when it lies on the surface of a sphere, then Lk = T and W = 0. The following three experiments will give an intuitive understanding of writhe.

(10) Wrap a wide rubber band around a cylinder with a screwtop lid that is in a closed position.
You might have to secure the band to the top and bottom of the cylinder with tape. What are Lk, T, and W? Now, twist the top counterclockwise a turn or two. Try to compensate for the deformational effect of the friction between the band and the container. What is Lk? Try to estimate the value of T (you will need "α") and then W.

(11) A thought experiment: A telephone cord in its relaxed state has its helical axis twisted into a solenoid (a coiled coil). This is almost all writhe and almost no twist. Stretch the cord so that the axis is almost straight. Now, there is almost no writhe but mostly twist.

By now, you should have a sense that twist is a measure of deformation due to a twisting motion, while writhe is a measure of bending. When the linking number is reduced in closed circular DNA, the molecule supercoils so as to minimize the total twisting and bending strain. The following experiment is a classroom demonstration, which you can also do on your own.

(12) Take a 3- or 4-foot length of flexible rubber tubing and lay it flat on the table. In this state, there should be no twist to the tubing. With a colored marker, draw a line on the top of the tube for its entire length. Draw a similar line opposite to that line (on the bottom surface of the tubing). These two lines will represent the 2 strands of DNA. Now wrap the tubing around a cylinder for a number of turns in a left-handed direction. One thing that might happen is that the tube twists as you wrap it. We don't want this to happen, so we can relieve most of the twist by wrapping as horizontally as we can and by keeping the set of lines that we drew oriented parallel to the cylinder wall. Attach the loose ends with tape, making sure to minimize the twist. The lines that you drew should run into each other. The structure that you made is called a left-handed solenoidal superhelix. Carefully remove the solenoid from its template, the cylinder.
What you will observe is that, when removed from the cylinder, it jumps into an interwound right-handed superhelix. If there were "N" turns in the left-handed solenoid, then there will be N/2 upward turns and N/2 downward turns in the interwound superhelix. If the helical pitch angle, α1, is small for the solenoid, it will be large (α2) for the interwound superhelix. What is the linking number for each of the configurations? What approximations can you make for the twist and writhe for each?

In non-dividing eukaryotic cells, chromosomal DNA is wrapped around a nucleosome core which consists of highly basic proteins called histones. This is the fundamental unit of organization of chromatin, and the individual nucleosomes are regularly arranged as "beads on a string" connected by linker DNA. The DNA is wrapped around the nucleosome in a left-handed solenoidal arrangement. This negative supercoiling is one of the forms taken up by underwound DNA. How can eukaryotic DNA be underwound if eukaryotic cells do not have enzymes that decrease the linking number? This can be demonstrated as follows:

(13) Take a large rubber band of medium width and lay it flat on the tabletop. It should assume a relaxed state and thereby represent closed circular DNA. Take a sphere about the size of a ping pong ball, and wrap the rubber band around it in such a way that you make a left-handed solenoid of one complete turn in which there is no twist (the rubber band lies flat against the sphere). Notice that (1) you have not altered the Lk of the rubber band and (2) a compensatory positive supercoil has occurred in the free part of the rubber band.

Eukaryotic topoisomerases can relax positive (and negative) supercoils. If you cut through the rubber band and reattach the ends after removing the supercoil, you will be left with a negative supercoil fixed to the nucleosome, and a change in Lk of -1.

Further explore the structure of the nucleosome.
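The bookkeeping in exercise (13) can be followed step by step using the relation that the linking number equals twist plus writhe. The sketch below is a deliberately idealized model, not a description of real chromatin: the class WrappedCircle and its method names are hypothetical, the starting linking number of 100 is arbitrary, twist is assumed to stay fixed (the band lies flat against the sphere), and writhe is split into a part held on the sphere and a part in the free loop.

```python
from dataclasses import dataclass


@dataclass
class WrappedCircle:
    """Idealized closed circular DNA, part of which may be wrapped on a sphere."""
    lk: int                # linking number (topological; changes only by cutting)
    twist: int             # total twist, assumed constant in this sketch
    bound_writhe: int = 0  # writhe held in place by the sphere ("nucleosome")

    @property
    def writhe(self) -> int:
        """Total writhe, from Lk = T + W."""
        return self.lk - self.twist

    @property
    def free_writhe(self) -> int:
        """Writhe left over in the part of the circle not on the sphere."""
        return self.writhe - self.bound_writhe

    def wrap_left_handed_turn(self) -> None:
        """Wrap one left-handed solenoidal turn; no strand is cut, so Lk is unchanged."""
        self.bound_writhe -= 1

    def relax_free_supercoils(self) -> int:
        """Model a topoisomerase removing all free supercoils; returns the change in Lk."""
        delta_lk = -self.free_writhe
        self.lk += delta_lk
        return delta_lk


if __name__ == "__main__":
    circle = WrappedCircle(lk=100, twist=100)   # relaxed: no writhe at all
    circle.wrap_left_handed_turn()
    print("after wrapping: Lk =", circle.lk, "free writhe =", circle.free_writhe)
    change = circle.relax_free_supercoils()
    print("after relaxing: Lk =", circle.lk, "change in Lk =", change)
```

In this model the wrap leaves the linking number untouched and forces a compensating +1 into the free loop; relaxing that free supercoil is the only step that changes the linking number, and it does so by -1, leaving the negative supercoil fixed on the sphere.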
We mentioned Z-DNA previously, and commented that reversible transitions of DNA segments, under appropriate conditions, between their B- and Z-forms may be important in the regulation of gene expression. Each change of a right-hand turn to a left-hand turn would be associated with the removal of two supercoils. The resulting, more relaxed state of the B-DNA that remains would present previously inaccessible segments of DNA to potential interactions with proteins.

A Look at Topoisomerases in More Detail

PowerPoint presentation on DNA structure: DNA STRUCTURE
PowerPoint presentation on topoisomerases: Slide 1
Study questions: StudyQwsDNATop.doc

In the next lecture, we will study DNA-protein interactions. Such interactions are very important when looking at regulation of DNA transcription and replication. I'll assume that you know the basics regarding both DNA transcription into mRNA and DNA duplication.

DNA-protein interaction page: DNA-protein Interactions