Design principles of ssDNA or RNA knots

To create ssDNA knots, we used DNA paranemic crossover (PX) motifs as the modular building blocks, as they can be readily connected with other PX motifs to enable single-stranded routing, and a node-edge network as the geometric blueprint for arbitrary nanostructures. In comparison with the compact parallel or antiparallel helical arrangements, wireframe networks are better candidates for constructing knotted structures, as they offer more space for DNA chains to thread through during the early formation of partial structures. For example, knot 9₁ (Alexander–Briggs notation) can be assembled either by connecting nine right-handed X-shaped junction tiles together (Fig. 1a) or by threading a single chain through itself nine times (Fig. 1b). Here, we used two parallel crossovers that were separated by four or six base pairs to form an X-shaped topology, which represented one crossing node in a knot (Fig. 1c). We arranged the nine X-shaped DNA tiles into a square with 2, 2, 2, and 3 crossings on the four edges, respectively (Fig. 1d), and separated the adjacent PX junctions by one turn (10 or 11 bp) or two turns (21 bp) of dsDNA. After connecting the nearest DNA strands and adding small linking structures at the four vertices, the resulting design consisted of only one long ssDNA (Supplementary Figure 1 and Fig. 1d). The overall routing of the ssDNA can be treated as a two-step process. First, half of the DNA chain folds back to partially pair with the other half, leaving several unpaired single-stranded regions (~4–6 nt) in between the perfectly paired regions (~10, 11, or 21 bp). Then, the unpaired regions pair with each other through paranemic cohesion interactions and finally knot into the target topology (Fig. 1d).

Fig. 1 Design of single-stranded DNA (ssDNA) or RNA knots. A knot with a crossing number of nine can be constructed via two strategies: a Assemble with preformed individual X-nodes that are linked together with specific sticky-end associations and ligation. b Thread and knot a single chain into the target topology. The color scale shows different regions of a single chain. c Paranemic crossover motifs were introduced as the building blocks for the knotted nucleic acid nanostructures. d A schematic diagram that shows the design and folding pathway of a ssDNA to form knot 9₁ as an example (the color scale shows different regions of the ssDNA/RNA). A single-chain DNA was assigned with partially paired regions to first form a large loop-stem hairpin structure. Then, the unpaired loop regions were designed to interact with each other through paranemic cohesions to form the target knot. The formation of the knot involved the threading of the two ends of the loop-stem structure by following a pathway similar to that shown in b.

Designing a topologically and kinetically favorable folding pathway is a key step for the successful formation of intricate structures with high crossing numbers. We introduced a hierarchical folding strategy to guide the knotting process in a prescribed order. For example, a knot with 23 crossings could be laid out on a three-column grid that is represented by a rectangle with three square cavities (Fig. 2c). To maintain the structural stability and rigidity of each edge, we limited the length and number of crossings on each edge: in any six turns (63 bp) of double-helical DNA, only two or three PX crossings were allowed. In total, 23 crossings were assigned to the 10 edges, of which seven edges had two crossings and three edges had three crossings (Fig. 2c). The order of the folding pathway could be designed in many possible ways; we listed all of the possible combinations for the formation order of the crossings in the knot and compared their routing pathways (Supplementary Figure 2).
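The edge bookkeeping described above can be checked with a few lines of arithmetic. The following is only an illustrative sketch, not the authors' design tool: the function name `split_edges` is hypothetical, and the helical repeat of 10.5 bp per turn is an assumption inferred from the "six turns (63 bp)" figure in the text.

```python
BP_PER_TURN = 10.5  # assumed B-DNA helical repeat (6 turns -> 63 bp, as in the text)


def split_edges(num_edges: int, total_crossings: int) -> tuple[int, int]:
    """Return (edges_with_2, edges_with_3) when every edge carries 2 or 3 crossings.

    Solves: 2*a + 3*b = total_crossings and a + b = num_edges.
    """
    edges_with_3 = total_crossings - 2 * num_edges
    edges_with_2 = num_edges - edges_with_3
    if edges_with_2 < 0 or edges_with_3 < 0:
        raise ValueError("no layout with only 2 or 3 crossings per edge")
    return edges_with_2, edges_with_3


if __name__ == "__main__":
    print(split_edges(4, 9))    # square knot: 4 edges, 9 crossings
    print(split_edges(10, 23))  # three-column grid: 10 edges, 23 crossings
    print(6 * BP_PER_TURN)      # length of six helical turns in bp
```

For the square and the three-column grid, this reproduces the splits stated in the text (three edges with two crossings plus one with three, and seven plus three, respectively).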

Fig. 2 Design and AFM characterization of two-dimensional single-stranded DNA and RNA knots. Designer models (top row) for the 2D nanostructures and their corresponding AFM images (middle row shows the zoomed-in images and bottom row shows the zoomed-out images) with increasing crossing numbers: a A DNA square with a crossing number of 9, b An RNA square with a crossing number of 9, c A DNA rectangle with three square cavities and a crossing number of 23, d A DNA 3 × 3 square lattice with a crossing number of 57. The color scale in the schematics indicates the routing of the large long-stem structures. The scale bars in the zoomed-in images represent 50 nm, while the ones in the zoomed-out images represent 200 nm.

Three essential rules were identified for optimizing the folding pathway. First, a linear folding path is better than a branched one, because a linear folding pathway involves two free ends that thread to form the loops in a sequentially ordered pathway, whereas a branched folding pathway has parallel steps that each involve a single free end threading through the preformed loops. From an entropic point of view, two free ends looping around each other is expected to be easier than one free end threading itself through preformed loops (Supplementary Figure 3), in agreement with the previously reported simulation theory25. Second, in the early stage of folding, when the unfolded portion of the strand is still long, the folding pathway should avoid threading the DNA strand through any of its own preformed structures (Supplementary Figure 4). Third, the edges with three crossings should generally fold before the edges with two crossings, since the cohesion provided by three paranemic interactions (12–18 bp in total) is expected to be stronger than that from two paranemic interactions (8–12 bp in total) with random sequence designs. If a two-crossing edge needs to be formed before three-crossing edges, the paranemic cohesion regions on that edge can be designed with a total length of 12 base pairs (sequence assignment is discussed further below). From the crossing positions and the grid layouts, the route for the chain threading is determined by specifying the order in which the chain visits each vertex and knots on each edge. Figure 3a, Supplementary Figure 4, and Supplementary Table 1 show the selected folding pathway for the three-column grid knots. We compared the folding yield of the two types of scaffold routings by AFM imaging (Supplementary Figure 5). The selected linear folding pathway (Supplementary Figure 4a) produced 57.9% (N = 214) well-formed structures, while the branched one (Supplementary Figure 4b) showed a yield as low as 0.9% (N = 221).
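The third rule can be illustrated by ranking edges by their total paranemic cohesion length. This is a simplified sketch only: the `Edge` class, the edge labels, and the tie-break (more crossings first when the total length is equal) are assumptions made for illustration, and base-pair count is used as a crude stand-in for cohesion strength, which in practice also depends on sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    name: str             # hypothetical edge label
    crossings: int        # number of PX crossings on the edge (2 or 3)
    bp_per_crossing: int  # paranemic cohesion length per crossing (4 or 6 bp)

    @property
    def cohesion_bp(self) -> int:
        """Total paranemic cohesion length on this edge, in base pairs."""
        return self.crossings * self.bp_per_crossing


def folding_order(edges: list[Edge]) -> list[Edge]:
    """Rank edges from strongest to weakest expected cohesion.

    Assumed tie-break: at equal total length, edges with more crossings come first.
    """
    return sorted(edges, key=lambda e: (e.cohesion_bp, e.crossings), reverse=True)


if __name__ == "__main__":
    edges = [Edge("e1", 2, 4), Edge("e2", 3, 6), Edge("e3", 2, 6), Edge("e4", 3, 4)]
    for edge in folding_order(edges):
        print(edge.name, edge.cohesion_bp)
```

With 4 bp or 6 bp cohesions, the totals span 8–12 bp for two-crossing edges and 12–18 bp for three-crossing edges, matching the ranges quoted above; a two-crossing edge with 6 bp cohesions reaches 12 bp, the same total as a three-crossing edge with 4 bp cohesions.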

Fig. 3 Optimization of the folding pathway for ssDNA knots. a A designed model shows the selected best folding pathway (from one to seven) for the three-square structure with 23 crossings, by following our optimization rules. The red to gray color scale as well as the numbers one to seven represent the order of the paranemic cohesion interaction strengths on the edges from high to low (one to seven), based on the number and length of the paranemic cohesions involved as shown in b. c The paranemic interaction regions can be designed with lengths of 4 bp or 6 bp with distinct expected binding strengths (6 bp > 4 bp). Therefore, one is able to guide the folding order of the knot structure by controlling the sequences and lengths of the paranemic interactions in each individual edge. We compared the folding efficiency of the known structures by using different folding pathways before and after optimization (d). The AFM images revealed a dramatic increase in the folding yield of well-formed structures from 0.9% (N = 221) to 57.9% (N = 214) (e). The scale bars are 200 nm.

The next step in our design procedure was to assign an appropriate sequence to enable the long ssDNA to create the structural and topological complexity. We established several criteria for generating a valid raw sequence. First, the GC content in all regions of the DNA sequence was kept between 30 and 70%, since GC content outside of this range would adversely affect the DNA synthesis. Second, depending on the size of the ssDNA, every segment that was 6–8 bases long was treated as one unit to help us evaluate the uniqueness of the DNA sequence; the specificity of recognition between the designed base pairings relies on this uniqueness. Third, runs of consecutive G were limited to 4 nt. A raw sequence was obtained by using the built-in algorithm of the Tiamat software26 in adherence with these rules. Then, the raw sequence was inspected manually and several modifications were made. The local sequences that were used to form the paranemic crossovers were checked to make sure that each of the crossovers was stable. The GC content of each paranemic cohesion region was designed individually but inter-dependently, as the regions needed to be compared with one another: all of the paranemic cohesions had to have sequentially decreasing melting temperatures, ordered according to the predetermined folding pathway. Lastly, the uniqueness of the paranemic cohesions was optimized independently, such that mismatches and cross-talk in the second step of the folding were minimized.
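The three raw-sequence criteria can be expressed as simple checks. The sketch below is not the Tiamat algorithm; it is a minimal stand-alone illustration in which the function names, the 20 nt sliding window used to evaluate regional GC content, and the default unit length of 7 nt are all assumptions. The uniqueness check simply reports k-mers that occur more than once along the strand and does not consider reverse complements.

```python
import re


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def longest_g_run(seq: str) -> int:
    """Length of the longest run of consecutive G bases."""
    return max((len(run) for run in re.findall(r"G+", seq)), default=0)


def repeated_kmers(seq: str, k: int = 7) -> set[str]:
    """Return every k-mer that appears more than once in the sequence."""
    seen: set[str] = set()
    repeats: set[str] = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            repeats.add(kmer)
        else:
            seen.add(kmer)
    return repeats


def check_sequence(seq: str, k: int = 7, window: int = 20) -> dict:
    """Apply the three criteria; the window size is an assumed parameter."""
    seq = seq.upper()
    starts = range(max(len(seq) - window, 0) + 1)
    gc_ok = all(0.30 <= gc_fraction(seq[i:i + window]) <= 0.70 for i in starts)
    return {
        "gc_ok": gc_ok,                           # 30-70% GC in every window
        "g_run_ok": longest_g_run(seq) <= 4,      # no G run longer than 4 nt
        "repeated_kmers": repeated_kmers(seq, k)  # non-unique k-nt units
    }


if __name__ == "__main__":
    print(check_sequence("ATGCCGTAAGCTTGCAATCGGATCCTAGGCAT"))
```

A real design would additionally need to treat the intended complementary regions as allowed pairings and compare melting temperatures of the paranemic cohesions, which this sketch does not attempt.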

Synthesis of long ssDNA molecules

Both the chemical and enzymatic synthesis of long ssDNA molecules are technically challenging, because the chain is largely self-complementary. As shown in the folding pathway, the ssDNA molecule first forms a long hairpin-loop structure with the 5’ and 3’ ends meeting each other. We first split the full-length ssDNA strand into two equal halves, each lacking significant secondary structure. We then inserted each half into a plasmid as a double-stranded gene and amplified it by cloning. The two dsDNA genes were obtained separately from the plasmids by restriction enzyme digestion (EcoRI + XbaI and XbaI + HindIII, respectively) and were then ligated together with a linearized phagemid vector, pGEM-7zf(-) (Supplementary Figure 6a). To obtain the full-length ssDNA molecule, the recombinant M13 phage was replicated in E. coli with the assistance of a helper plasmid, pSB4423 (ref. 27). Because the helper plasmid, pSB4423, does not contain a phage replication origin, only the phagemid vector containing the full-length ssDNA origami gene was able to act as a template for the phage DNA replication. After the extraction and purification of the recombinant phage DNA, EcoRV digestion was performed to cut out the target ssDNA (Supplementary Figure 6a). Native agarose gel electrophoresis was used to separate the target ssDNA from the phagemid vector ssDNA (Supplementary Figure 6b). Using this method, we synthesized and amplified all of the long ssDNA strands in nanomole quantities (with a 1 L scale of E. coli culture) and at high purity. The purified ssDNA strand was then self-assembled (folded) in 1× TAE-Mg buffer with a 12 h or 24 h annealing ramp from 65 °C to 25 °C (see details in the Methods section). The folded products were then characterized by AFM imaging, gel electrophoresis and/or cryo-EM imaging.

Complex knots with large crossing numbers

We applied our design procedures to create more complex DNA knots with increasing crossing numbers. A 3 by 3 square grid of DNA knots with 57 crossing nodes was designed with an optimized linear folding pathway (Fig. 2d and Supplementary Figure 7). Other geometric layouts were also used following the same design principles. A large molecular knot with 67 crossings in a hexagonal lattice was designed and constructed. High-resolution AFM imaging was used to characterize the structural formations (Supplementary Figure 8). The smaller knot structures with crossing numbers 9 and 23 folded well, with yields as high as 69% (N = 103) and 58% (N = 214), respectively (Fig. 2a, c and Supplementary Figures 9 and 5). However, as the crossing number of the knot increased to 57 or 67, the folding yield dropped significantly: only 1.2% of the imaged structures were perfectly formed for the 57-crossing knot (N = 327) and none were perfectly formed for the 67-crossing knot (N = 389), based on single-molecule analysis by AFM imaging (Fig. 2d and Supplementary Figures 7, 8). Almost every structure we examined showed some degree of folding defects (Supplementary Figure 8). With such a high complexity, even the hierarchical folding optimization did not significantly increase the overall yield (Supplementary Figure 10 and Supplementary Table 2). The folding behaviors of our ssDNA knots were remarkably different from those of the classic DNA structures. To make the target knots, the ssDNA chain needed to fold following an exactly defined order. If one crossing was misfolded in an earlier stage, it would be impossible (or at least extremely difficult) for it to correct itself afterwards. Nevertheless, the yields of those knots were not surprisingly low when compared with the yields of multistep chemical synthesis reactions. If we treated the formation of one crossing as one knotting step, the average yield for each knotting step could be estimated to be at least 90%, and in the low crossing number cases, the single-step yield was as high as ~96%.
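The per-step estimate can be reproduced with a short calculation. The sketch below assumes, as a simplification not stated explicitly in the text, that every crossing forms independently with the same probability, so that the overall yield Y of an n-crossing knot equals the per-step yield raised to the power n; the function name is hypothetical.

```python
def per_step_yield(overall_yield: float, crossings: int) -> float:
    """Per-crossing yield under the assumption of n independent, equal steps.

    overall_yield = step_yield ** crossings  =>  step_yield = overall_yield ** (1 / crossings)
    """
    if crossings <= 0:
        raise ValueError("crossings must be positive")
    return overall_yield ** (1.0 / crossings)


if __name__ == "__main__":
    # (crossing number, overall folding yield) as reported in the text
    reported = [(9, 0.69), (57, 0.012)]
    for crossings, overall in reported:
        step = per_step_yield(overall, crossings)
        print(f"{crossings} crossings: per-step yield = {step:.3f}")
```

Under this model, the 9-crossing knot gives a per-step yield of about 96% and the 57-crossing knot a value above 90%, consistent with the figures quoted above.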

Topological control to validate the knotting configuration

As most of our high crossing number ssDNA knots were characterized by high-resolution AFM imaging (Fig. 2), one question was raised: whether defective nodes in the final knot structure could be truly and completely identified by AFM imaging. We designed a link structure with eight nodes as a topological control (Supplementary Figure 11). This link structure contained two dsDNA rings (each only partially complementary), which connected to each other through eight paranemic cohesions. We constructed this particular structure by annealing two linear dsDNAs (each preformed from two ssDNAs) with eight stretches of mismatches (bubbles), each consisting of 6 nt (Supplementary Figure 11). The mismatches within these two linear dsDNAs would interact with their counterparts to form the stable paranemic cohesions. Nine-base-pair sticky ends that extended from both ends of the dsDNAs closed the two rings after the formation of the correct structure. This link structure assembled well, as characterized by high-resolution AFM imaging (Supplementary Figure 11a). In contrast, if the two linear dsDNAs were first ligated to form closed ring structures, we reasoned that these two dsDNA rings would not be able to assemble into the desired fully interlocked loop structure. As expected, although the two dsDNA rings could still bind with each other partially through some of the paranemic cohesion interactions (Supplementary Figure 11b), extensive defects were observed in all of the structures, and the unknotted structures could easily be scratched or deformed by AFM tips during scanning (Fig. 3, Supplementary Figure 11, and Supplementary Table 3), indicating that the intact structures shown in Fig. 3a were knotted structures following the design.

Design and construction of ssRNA knots

Our design strategies for ssDNA knots can be adapted to create ssRNA knots. Although knots have not been found in any of the naturally occurring RNA structures discovered to date, they might have been important factors in the early stages of evolution on earth, because knotted entanglements may be capable of conserving the spatial information of the RNA network without involving covalent bonds in a harsh environment28. We designed an X-shaped RNA modular building block, which was similar to the PX structures of DNA, and followed the same steps as for constructing the ssDNA knots. First, based on the 3D modeling of an A-form dsRNA helix (11 bp per helical turn, 19 degree inclination of base pairs) and the best geometric fitting, 8 bp (instead of 4 or 6 bp) was chosen for the length of a paranemic crossover (Supplementary Figure 12). For an eight-base-pair paranemic cohesion, a total of 4^8 = 65,536 possible sequences provided an adequate sequence space for the selection of unique complementarity to sufficiently avoid undesired interactions between the PX motifs. Second, given the 11 base pairs per turn of an A-form dsRNA, we assigned the lengths of the inter-motif stems as alternating between 8 and 9 bp (Supplementary Figure 12) to achieve a structural repeating unit of 33 bp for three full helical turns (i.e., 8 bp PX + 8 bp stem + 8 bp PX + 9 bp stem = 33 bp = 3 full turns). In this design, the neighboring structural units were in line with each other without accumulating helical twist, and the final assembled structure was expected to stay in 2D. As for the ssDNA 9₁ knot, we assigned 2, 2, 2, and 3 crossings on the four edges of a square and looped the vertices to form one single-stranded RNA (Fig. 2b). After generating the appropriate sequences by following the same sequence design rules as for the ssDNA structures, the dsDNA gene coding for the long RNA strand was first synthesized, and then the ssRNA molecule was obtained by an in vitro transcription reaction. After annealing, the AFM images revealed the successful formation of the ssRNA 9₁ knots (Fig. 2b and Supplementary Figure 12).
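The RNA repeat-unit arithmetic and the size of the sequence space can be verified directly. This is a small illustrative sketch; the constant and function names are hypothetical, and the only inputs are the values given in the text (11 bp per A-form turn, 8 bp PX regions, alternating 8 and 9 bp stems).

```python
RNA_BP_PER_TURN = 11  # A-form dsRNA helical repeat, as stated in the text

# One structural repeating unit: PX, stem, PX, stem (lengths in bp)
REPEAT_UNIT = [("PX", 8), ("stem", 8), ("PX", 8), ("stem", 9)]


def unit_length_bp(unit: list[tuple[str, int]]) -> int:
    """Total length of one repeating unit in base pairs."""
    return sum(length for _, length in unit)


def helical_turns(length_bp: int, bp_per_turn: int = RNA_BP_PER_TURN) -> float:
    """Number of helical turns spanned by a duplex of the given length."""
    return length_bp / bp_per_turn


def sequence_space(length_bp: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of a given length over four bases."""
    return alphabet_size ** length_bp


if __name__ == "__main__":
    total = unit_length_bp(REPEAT_UNIT)
    print(total, helical_turns(total))  # repeat length and number of turns
    print(sequence_space(8))            # sequences available for an 8 bp cohesion
```

An integer number of turns per repeating unit is what keeps neighboring units in register, which is the stated reason the assembled structure is expected to remain planar.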

Producing 3D ssDNA knots

The design procedures presented here can also be applied to create 3D architectures with arbitrary geometries. We demonstrated the versatility of our method by constructing four ssDNA polyhedral meshes: a tetrahedron, a square pyramid, a triangular prism, and a pentagonal pyramid with crossing numbers 15, 20, 22, and 25, respectively (Fig. 4). A Schlegel diagram was used to transfer the 3D objects to their topologically equivalent 2D nets. Optimized folding pathways were designed carefully for step-wise hierarchical assembly, and the corresponding ssDNA strands were designed, synthesized, and assembled. AFM images showed an abundance of well-folded 3D nanoparticles with the expected sizes (Supplementary Figures 13–16 and Supplementary Tables 4–7). Single-particle cryo-EM 3D reconstruction revealed that the overall conformations matched the designed geometries well (Fig. 4, Supplementary Figures 17, 18, and Supplementary Table 8). Notably, the vertex design for our ssDNA knots was different from the multi-arm junction design based on the double-crossover (DX) motif29,30. Instead, there is a 5 bp difference in length between the two parallel dsDNAs that form each edge, leading to chiral vertices and inclined edges in the ssDNA knots (Fig. 4). This unique geometric feature could be used to identify conformational diastereomers, that is, structures that are turned inside out while still satisfying all programmed Watson–Crick base pairing with the same network connectivity. The 3D reconstruction data suggested that the ssDNA knots preferred to point the major grooves inwards at the vertices. A similar feature has been previously observed for wireframe DNA nanostructures31,32.