The idea for designing and synthesizing a eukaryotic chromosome was initiated by our group in collaboration with Jef Boeke in 2005. The concept for hierarchically synthesizing a designer yeast chromosome was quite simple. First, design the synthetic chromosome incorporating all the desired changes based on the available wild-type chromosome sequence of S. cerevisiae. Second, compile the designed chromosome into pieces of about 10 kbp by including unique restriction sites at the 5′ and 3′ ends to enable further ligation of the 10 kbp pieces into segments of about 30–50 kbp. Synthesize these pieces of about 10 kbp using oligonucleotides from commercial vendors. Third, as yeast is highly recombinogenic, use an iterative strategy with alternating genetic markers to replace each 30–50 kbp segment of the wild-type sequence with the corresponding synthetic pieces, one at a time by homologous recombination in vivo in yeast.
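The three-step plan above can be sketched as a simple coordinate-partitioning exercise. The following is a minimal illustration, not the actual design software: the function name `plan_hierarchy`, the fixed piece length, and the choice of four pieces per segment are assumptions made for the example, and the marker names are borrowed from the synIII construction described later in the text.

```python
def plan_hierarchy(chrom_len, piece_len=10_000, pieces_per_segment=4,
                   markers=("LEU2", "URA3")):
    """Partition a chromosome into ~10 kbp pieces grouped into 30-50 kbp segments.

    Returns a list of (marker, pieces) tuples, one per replacement round,
    where each piece is a (start, end) half-open coordinate pair.
    """
    # Step 2 of the text: cut the designed sequence into ~10 kbp pieces.
    pieces = [(start, min(start + piece_len, chrom_len))
              for start in range(0, chrom_len, piece_len)]
    # Group consecutive pieces into larger segments (4 x 10 kbp = 40 kbp here).
    segments = [pieces[i:i + pieces_per_segment]
                for i in range(0, len(pieces), pieces_per_segment)]
    # Step 3 of the text: alternate selectable markers from round to round.
    return [(markers[i % len(markers)], seg) for i, seg in enumerate(segments)]


# Example with the native chromosome III length quoted later in the text.
plan = plan_hierarchy(316_667)
for marker, seg in plan:
    print(marker, seg[0][0], seg[-1][1])
```

Each tuple corresponds to one round of in vivo replacement; alternating the marker lets each round select for the new segment while the marker from the previous round is lost.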

The initial proof-of-principle experiment was performed in our laboratory by first designing and synthesizing a 30 kbp fragment of yeast chromosome III and then replacing the wild-type segment with the synthetic piece in yeast [29]. By 2007, the idea of synthesizing a eukaryotic chromosome had morphed into an ambitious project with the goal of rewriting wild-type S. cerevisiae Sc1.0 into a synthetic version, Sc2.0.

Design principles for the synthetic yeast genome (Sc2.0)

Suggestions for the types of changes to be incorporated into Sc2.0 were obtained by Boeke from the community of yeast researchers. Only conservative changes were included, as more drastic changes might result in ‘dead’ yeast. The synthetic yeast should have the same fitness as the wild type and grow normally; this is an obvious minimal requirement for Sc2.0. The three design principles for the synthetic yeast genome are as follows: (1) it should result in a (near) wild-type phenotype and fitness; (2) it should lack destabilizing elements, to prevent the synthetic yeast genome from becoming unstable or undergoing rearrangements; (3) it should have genetic flexibility to facilitate future studies [30].

How does one design a Sc2.0 genome that will facilitate future studies? Yeast contains about 6000 genes and almost 5000 of these are non-essential when disrupted individually [31]. As such, all the non-essential genes were flanked with loxPsym sites. Once a synthetic chromosome or the Sc2.0 genome is built, in theory, one could expose the synthetic yeast strains to Cre recombinase for various time intervals and look for survivors. PCR-Tag analysis (see synIII construction) and sequencing of the genomes of survivors would reveal what combinations of non-essential genes have been deleted from the starting Sc2.0 genome, leaving the survivors viable.

synIII design

After a successful proof-of-principle experiment involving the design of a synthetic 30 kbp chromosome III fragment that was used to replace the native sequence in yeast, the sequence of the whole native chromosome III was edited in silico using Biostudio [32] to incorporate a series of deletions, insertions and base substitution changes to produce the desired ‘designer’ sequence (Box 2 and Fig. 3a). The synthetic version of chromosome III (known as synIII) also encodes a built-in recombination system called SCRaMbLE (synthetic chromosome rearrangement and modification by loxP-mediated evolution) to enable removal of the non-essential parts of the chromosome, and therefore streamline it, by inducing genomic alterations of the synIII strain using Cre recombinase [32]. As the result of these alterations, synIII (272,871 bp) is about 13.8 % smaller than the native chromosome III (316,667 bp) [32].
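The quoted size reduction follows directly from the two chromosome lengths. A short check of the arithmetic, using only the figures given above:

```python
native_bp = 316_667   # native chromosome III
syn_bp = 272_871      # synIII

removed_bp = native_bp - syn_bp
reduction_pct = removed_bp / native_bp * 100

print(f"{removed_bp} bp removed, {reduction_pct:.1f} % smaller")
# prints "43796 bp removed, 13.8 % smaller"
```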

Fig. 3 synIII design and synthesis. a synIII design. Twenty-one retrotransposons (RT) and seven introns were removed. Forty-three TAG stop codons were changed to TAA stop codons. Ninety-eight loxPsym sites were introduced to enable SCRaMbLE analysis. The two natural telomeres were replaced with shorter universal telomere caps. A single copy of essential tRNA gene SUP61, which codes for tRNASer (CGA), was deleted and moved to a tRNA neochromosome. Numerous PCR-Tags were incorporated into synIII to distinguish it from the natural counterpart. As a result, synIII is about 13.8 % smaller than the native yeast chromosome III (Box 2). For the complete set of additions, deletions and other genome modifications to synIII, see Annaluru et al. [32]. b synIII synthesis. synIII was constructed in three steps (shown in the flow diagram on the left, from bottom to top). In step 1, 750 bp building blocks (BBs) were synthesized from 60-mer oligonucleotides at Johns Hopkins University by undergraduate students in the Build-A-Genome course [33]. In step 2, three to five BBs were assembled into 2–4 kb minichunks by homologous recombination in Saccharomyces cerevisiae [35]. Adjacent minichunks were designed to encode an overlap of one BB to facilitate downstream assembly. In step 3, direct replacement of native yeast chromosome III with pools of synthetic minichunks was performed. Eleven iterative one-step assemblies and replacements of native genomic segments of yeast chromosome III were carried out using pools of overlapping synthetic DNA minichunks, encoding alternating genetic markers (LEU2 or URA3), which enabled complete replacement of native III with synIII in yeast [32]. The numbers of oligonucleotides, BBs, and minichunks needed to construct synIII are shown in parentheses. SynIII is 272,871 bp long, compared with the 316,667 bp long native yeast chromosome III.
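Of the design edits listed in the caption, the TAG-to-TAA stop codon swap is the simplest to express in code. The sketch below is illustrative only and is not taken from BioStudio; the function name `swap_tag_stop` and the example sequences are hypothetical. It assumes the input is an in-frame coding sequence ending in its stop codon.

```python
def swap_tag_stop(cds: str) -> str:
    """Replace a terminal TAG stop codon with TAA; leave other sequences unchanged."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Only the final, in-frame codon is inspected, so an out-of-frame
    # 'TAG' inside the gene is never touched.
    if cds.endswith("TAG"):
        return cds[:-3] + "TAA"
    return cds


print(swap_tag_stop("ATGGCTTAG"))     # terminal TAG is recoded
print(swap_tag_stop("ATGCTAGCTTGA"))  # TGA stop, internal out-of-frame TAG: unchanged
```

Recoding every TAG in this way frees that codon from use as a stop signal anywhere on the chromosome, which is the kind of genetic flexibility the design principles call for.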

synIII construction

The hierarchical workflow that was used to construct synIII (Fig. 3b) consisted of three major steps. In the first step, the 750 bp ‘building blocks’ (BBs) were produced starting from overlapping 60-mer to 79-mer oligonucleotides and assembled using standard PCR methods [33]. In the second step, the BBs were assembled into overlapping DNA ‘minichunks’ of approximately 2–4 kb using either the uracil-specific excision reaction (USER) [34] or cloning into a shuttle vector by homologous recombination in yeast S. cerevisiae [35–39]. In the USER approach, four to five BBs are used that each have a 5–13 bp sequence of the type A(N)₃T to A(N)₁₁T that overlaps with their adjoining neighbors and a vector. These BBs are amplified using forward and reverse primers containing a single uracil instead of the T and are then treated with USER enzymes (a mixture of uracil DNA glycosylase and the DNA glycosylase-lyase endonuclease VIII) to generate complementary single-stranded ends. The BBs are then ligated and cloned into E. coli to recover recombinants containing the assembled ‘minichunks’. The yeast homologous recombination cloning approach is much simpler: four to five BBs, each with 40 bp overlaps with their adjoining neighbors, are assembled into a shuttle vector by direct transformation into the highly recombinogenic S. cerevisiae. This approach obviates the need for another round of PCR amplification of the BBs using primers containing uracil and the use of USER enzymes. Thus, as it turns out, all you need is yeast for minichunk assembly. In the third and final step, the adjacent minichunks for synIII were designed to overlap one another by one BB to facilitate further assembly in vivo by homologous recombination in yeast. Using an average of 12 minichunks and alternating selectable markers in each experiment, the native sequence of S. cerevisiae chromosome III was systematically replaced by its synIII counterpart in 11 successive rounds of transformation. PCR-Tag analysis (Fig. 4) and sequencing confirmed the identity of synIII [32]. The fact that the numerous design changes to the DNA sequence of chromosome III had little or no impact on cell fitness and phenotype suggests the very pliable nature of the yeast genome [32].
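The overlap scheme in the third step, where adjacent minichunks share one BB, can be sketched as a tiling of BB indices. This is a minimal illustration under stated assumptions: the function name `group_into_minichunks` is hypothetical, a fixed four BBs per minichunk is assumed (the text gives a range), and at least two BBs are expected as input.

```python
def group_into_minichunks(n_bbs, bbs_per_chunk=4):
    """Group BB indices 0..n_bbs-1 into minichunks that overlap by one BB."""
    if bbs_per_chunk < 2:
        raise ValueError("a minichunk needs at least two BBs to overlap by one")
    chunks = []
    start = 0
    step = bbs_per_chunk - 1          # each new minichunk reuses the previous last BB
    while start < n_bbs - 1:
        chunks.append(list(range(start, min(start + bbs_per_chunk, n_bbs))))
        start += step
    return chunks


chunks = group_into_minichunks(10)
print(chunks)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]

# Averages quoted in the text: about 12 minichunks per round, 11 rounds.
print(12 * 11)  # 132 minichunks implied in total
```

The shared BB gives each pair of neighboring minichunks a 750 bp stretch of identical sequence, which is the homology that recombination in yeast uses to stitch them together.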

Fig. 4 PCR-Tag analysis of a synIII segment. a The YCL061C.3 locus-specific PCR-Tag forward (F) and reverse (R) primers for the wild type (WT) and synIII are shown. The changes between the two are shaded. PCR-Tags are short pairs of recoded segments used as genetic markers to verify introduction of a synthetic sequence and removal of native sequence. Pairs of 25–28 bp sequences about 500 bp apart were recoded with synonymous codons such that >33 % of the bases were changed; the first and last bases of the PCR-Tag primers were coded to differ between the WT and synIII sequences. b Agarose gel profiles of PCR-Tag analysis of a WT DNA segment and the corresponding synthetic synIII segment (YCL061C.3 to YCL050C.1). A virtual gel image was generated using LabChip GX software version 4.0.1418.0.
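The two PCR-Tag criteria in the caption, synonymous recoding and more than 33 % of bases changed, are easy to check computationally. The sketch below uses the standard genetic code; the two 27 bp sequences are invented for illustration and are not the YCL061C.3 tags, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an in-frame DNA sequence into a one-letter protein string."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def fraction_changed(wt, syn):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(wt) != len(syn):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(wt, syn)) / len(wt)


def is_valid_pcr_tag(wt, syn, min_changed=0.33):
    """True if syn encodes the same peptide as wt and differs at > min_changed of bases."""
    return translate(wt) == translate(syn) and fraction_changed(wt, syn) > min_changed


# Hypothetical 27 bp example (nine codons), not a real Sc2.0 tag.
wt_tag = "TCTCTGAGAGCTTTGTCCAGACTGTCT"
syn_tag = "AGCTTACGTGCACTAAGTCGCTTGAGC"
print(translate(wt_tag), translate(syn_tag), is_valid_pcr_tag(wt_tag, syn_tag))
```

Because the protein is unchanged while the primer-binding sequence diverges strongly, a primer pair designed against one version should amplify only that version, which is what the gel in panel b reads out.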

International consortium to synthesize the Sc2.0 genome

A group of international scientists has taken up the synthesis of the Sc2.0 genome. The Beijing Genomics Institute in China was the first to agree to synthesize four of the yeast chromosomes. Since then, laboratories from various other countries have also joined the Sc2.0 effort to synthesize the remaining yeast chromosomes. Each participating laboratory is required to sign an agreement with Johns Hopkins University (now with New York University). This arrangement leaves the control of the Sc2.0 project to Boeke, who is a yeast expert. Such a central organization is needed for the coordination of a huge undertaking such as Sc2.0 and for the distribution of yeast strains, reagents and experimental protocols. Participating laboratories have to raise their own funds from their own country to synthesize the allotted chromosome.

What’s next for the yeast synthetic genome?

The synIII chromosome is about 2.5 % of the yeast genome and the changes that were made were all conservative, although numerous. These sequence alterations have not reduced the fitness of the yeast, which is encouraging in terms of the potential for future modifications. There are 98 loxPsym sites in synIII, which scales to about 4000 loxPsym sites for the entire Sc2.0 genome. It is not yet clear how all of these loxPsym sites along with all the other modifications will ultimately affect the stability of the Sc2.0 genome and the viability of the synthetic yeast cell. The results from synIII are encouraging and the synthesis of a few more chromosomes will give us a better idea. Boeke’s laboratory is working on the assembly of the synVI chromosome using fragments of approximately 10 kbp from commercial vendors. Our laboratory is in the process of completing the assembly of the synIX chromosome. With the experience gained from the synthesis and assembly of synIII, we estimate that the construction of a chromosome of about 1 Mbp could be done in 2–3 years.
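The genome-wide loxPsym estimate is a linear extrapolation from synIII. A quick check using only the figures in the paragraph above, and assuming the site density of synIII holds across the other chromosomes:

```python
loxpsym_in_syn3 = 98
syn3_fraction_of_genome = 0.025   # synIII is about 2.5 % of the genome

estimated_sites = round(loxpsym_in_syn3 / syn3_fraction_of_genome)
print(estimated_sites)  # 3920, i.e. roughly 4000
```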

Once the Sc2.0 genome is built, an important focus will be to determine the minimal eukaryotic (yeast) genome. If two or more genes perform a similar function, can one be deleted? Which combinations of the 5000 yeast non-essential genes that are dispensable individually can be simultaneously removed? If we possess this knowledge, we will be able to achieve further reduction in the size of the Sc2.0 chromosomes and the genome. The plan is to use SCRaMbLE analysis to arrive at the minimal yeast genome (Fig. 5). This approach involves exposing Sc2.0 to Cre recombinase for various time intervals and looking for survivors. We reason that PCR-Tag analysis and sequencing the genomes of the survivors would reveal what combinations of non-essential genes have been deleted from the starting synthetic genome, leaving the survivors viable.

Fig. 5 Synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE) of the synIII strain. Examples of inversion, translocation and deletion products resulting from Cre recombinase treatment of synIII strain are shown.

This pathway to the minimal yeast genome would represent a ‘top down’ approach since we start from the entire newly designed Sc2.0 genome and progressively delete increasing parts of the genome. However, to complicate matters, the essential and non-essential genes of the synthetic yeast are interspersed with one another. Because of this intertwining, SCRaMbLEing of Sc2.0 is likely to result in dead yeast most of the time. Only yeast with small deletions are likely to survive, making it difficult to deduce the minimal genome. Furthermore, due to the inherent symmetry of loxPsym sites, when two such sites are brought together by Cre recombinase, the result could be an insertion, a deletion, an inversion or a translocation (Fig. 5). Moreover, there is also the possibility of interchromosomal rearrangements through the loxPsym sites, in addition to the expected intrachromosomal deletions, inversions, insertions and rearrangements. Analysis of such widely variant genomes from a population of survivors would involve time-consuming, costly experimentation and complicated data analysis to decipher the minimal yeast genome. This hurdle could be overcome to some extent by performing SCRaMbLE analysis at the level of intermediate yeast strains, each possessing an individual synthetic chromosome. Thus, one could delineate a set of 16 minimal chromosomes for yeast. All of the reduced yeast chromosomes could then be combined into a final yeast strain to form a minimal eukaryotic genome.
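The argument above, that interspersed essential genes make most SCRaMbLE events lethal, can be illustrated with a toy simulation. This is a deliberately simplified sketch and not a model used by the Sc2.0 project: the chromosome is a list of gene names with a loxPsym site at every boundary, only single deletion and inversion events are modeled (insertions, translocations and interchromosomal events are ignored), and all function names are hypothetical. The loxPsym sequence shown is the 34 bp site as commonly given in the literature; it does not appear in this text and is included as an assumption to show the symmetry that makes the site orientation-free.

```python
import random

# Assumption: 34 bp loxPsym sequence as commonly cited; not stated in this article.
LOXPSYM = "ATAACTTCGTATAATGTACATTATACGAAGTTAT"


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))


def scramble_once(segments, rng):
    """Apply one random Cre event between two loxPsym sites.

    Sites sit at the n + 1 boundaries of the n segments. Because the site is
    symmetric, deletion and inversion are treated as equally likely.
    """
    i, j = sorted(rng.sample(range(len(segments) + 1), 2))
    if rng.random() < 0.5:
        return segments[:i] + segments[j:]                     # deletion
    return segments[:i] + segments[i:j][::-1] + segments[j:]   # inversion


def is_viable(segments, essential):
    """A strain survives only if every essential gene is still present."""
    return essential <= set(segments)


def survivor_fraction(segments, essential, trials=1000, seed=0):
    """Fraction of single-event SCRaMbLE outcomes that remain viable."""
    rng = random.Random(seed)
    alive = sum(is_viable(scramble_once(segments, rng), essential)
                for _ in range(trials))
    return alive / trials


print(revcomp(LOXPSYM) == LOXPSYM)  # the site is its own reverse complement

genes = [f"g{k}" for k in range(20)]
interspersed = {f"g{k}" for k in range(0, 20, 4)}   # every fourth gene essential
print(survivor_fraction(genes, interspersed))
```

In this toy model every inversion is viable, so all lethality comes from deletions that span an essential gene; with essential genes spaced every few segments, only short deletions survive, which mirrors the concern raised in the text.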