Designing TALENs targeting the soybean FAD3 genes

Soybean oil is composed primarily of palmitic, stearic, oleic, linoleic and linolenic acids. In oil from wild-type (WT) plants, these five fatty acids are present at approximately 13, 4, 20, 55 and 8 %, respectively (Fig. 1a). Previously, we used TALENs to generate knockout mutations within both the FAD2-1A and FAD2-1B genes [14]. Oil from the resulting plants contained higher levels of oleic acid (~79 %) and lower levels of linoleic and linolenic acids (~5 % each) than oil from WT plants. Here, we sought to further improve oil characteristics by knocking out genes involved in the conversion of linoleic acid to linolenic acid. We predicted that knocking out the FAD3 linoleate desaturase genes would further decrease linolenic acid levels.

Fig. 1 Design of TALENs targeting FAD3 genes within Glycine max. a Illustration of the fatty acid pathway. Relative percent composition of individual fatty acids in the oil from WT and fad2-1 knockout plants is shown on the right. b Schematic of the FAD3A genomic sequence. Triangles, approximate TALEN binding sites; black boxes, exons; gray boxes, 5′ and 3′ untranslated regions. c Nucleotide sequences of the predicted TALEN target sites within the FAD3A, FAD3B and FAD3C genes. Bold and underlined nucleotides indicate TALEN binding sequences. Lower case nucleotides indicate positions of SNPs. d Illustration of a TALEN monomer expression vector. P NOS, nopaline synthase promoter; SV40 NLS, simian virus 40 large T-antigen nuclear localization signal; T NOS, nopaline synthase terminator; AmpR, ampicillin resistance gene

The soybean genome contains three linoleate desaturase genes: FAD3A (Glyma14g37350), FAD3B (Glyma02g39230) and FAD3C (Glyma18g06950). In terms of nucleotide similarity of their coding sequences, FAD3A shares 96.2 % identity to FAD3B and 14.4 % identity to FAD3C. Compared to mutations in FAD3B and FAD3C, mutations in FAD3A confer the greatest decrease in linolenic acid levels in soybean oil (from ~8 to ~4 %) [22], which is consistent with the higher expression of FAD3A within developing seeds [15]. Therefore, we sought to design TALENs that primarily recognize FAD3A sequence. Three TALEN pairs were synthesized that recognize sequences within exon two or exon three of FAD3A (designated GmFAD3_T01.1, GmFAD3_T02.1 and GmFAD3_T03.1) (Fig. 1b). The TALENs were designed to recognize FAD3A sequence that is partially conserved in FAD3B and FAD3C; however, relative to the FAD3A sequence, the recognition sequences for every TALEN pair at FAD3B and FAD3C contained between one and 11 single-nucleotide polymorphisms (SNPs) (Fig. 1c).
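The SNP counts referred to above can be illustrated with a minimal sketch that counts mismatched positions between two aligned binding-site sequences. The function name `count_snps` and the three sequences are hypothetical and are not taken from Fig. 1c; the sketch also assumes that the sites are already aligned and of equal length, with no insertions or deletions.

```python
def count_snps(reference: str, other: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences.

    Comparison is case-insensitive so that SNPs written in lower case
    (as in the figure convention) are compared by base, not by case.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a.upper() != b.upper() for a, b in zip(reference, other))


# Hypothetical 15-nt binding-site sequences (illustration only, not real data).
fad3a_site = "TCCACTTGGAAAGCA"  # reference site, 0 SNPs by definition
fad3b_site = "TCCACTTGGcAAGCA"  # one position differs from the reference
fad3c_site = "TCtACTTGaAAgGCA"  # three positions differ from the reference

for name, site in [("FAD3B", fad3b_site), ("FAD3C", fad3c_site)]:
    print(name, count_snps(fad3a_site, site))
```

For a TALEN pair, the count would be taken over both monomer binding sites; the spacer sequence is not part of the binding sequence.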

Assessing TALEN activity in protoplasts by deep-sequencing

To determine TALEN activity, soybean protoplasts were transformed with plasmid DNA encoding each TALEN pair and the FAD3 target sites were deep-sequenced. To this end, approximately 500,000 protoplasts were transformed, using polyethylene glycol, with 15 μg each of the two plasmids encoding a complete TALEN pair. Genomic DNA was isolated ~48 h post transformation and used as a template in PCRs with primers designed to individually amplify the TALEN target sites within the FAD3A, FAD3B or FAD3C gene. Amplicon pools for each TALEN target site were sequenced by 454 pyrosequencing. For all three TALEN pairs, we observed evidence of NHEJ mutations in two of the three FAD3 genes (Fig. 2a). TALEN pair GmFAD3_T02.1 introduced mutations within both FAD3A and FAD3B and, relative to the other TALEN pairs, had the highest activity at its intended target sequence, FAD3A (16.0 %). In contrast, TALEN pair GmFAD3_T03.1 had the lowest activity at its intended FAD3A target sequence (4.9 %). For all three TALEN pairs, activity at the FAD3B and FAD3C target sites was lower than at the respective FAD3A target site, most likely because of SNPs within the FAD3B and FAD3C TALEN binding sites.

Fig. 2 FAD3 TALEN activity in soybean protoplasts. a TALEN pairs were assessed for their activity ~48 h after transformation of soybean protoplasts. The frequency of mutagenesis represents the total number of sequence reads with insertions or deletions divided by the total number of sequence reads. The resulting number was then divided by the transformation frequency (90 %), which was determined using a YFP control plasmid. b TALEN activity relative to the number of SNPs present within the predicted TALEN binding sites
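The calculation described in the Fig. 2a legend can be sketched as follows. The function name and the read counts are hypothetical, chosen only so that the arithmetic is easy to follow; the 90 % transformation frequency is the value stated in the legend.

```python
def mutation_frequency(indel_reads: int, total_reads: int,
                       transformation_frequency: float = 0.90) -> float:
    """Return the mutation frequency in percent.

    The fraction of reads carrying insertions or deletions is divided by
    the transformation frequency, so that the result reflects activity in
    transformed cells only.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 < transformation_frequency <= 1:
        raise ValueError("transformation_frequency must be in (0, 1]")
    raw_fraction = indel_reads / total_reads
    return 100.0 * raw_fraction / transformation_frequency


# Hypothetical read counts (illustration only, not data from this study):
# 450 indel reads out of 10,000 is 4.5 % raw, or 5.0 % after correction.
print(f"{mutation_frequency(450, 10_000):.1f} %")
```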

We observed a correlation between the number of SNPs within TALEN binding sites and the relative mutation frequencies (Fig. 2b). Mutation frequencies at the FAD3A target sites (containing 0 SNPs) for TALEN pairs GmFAD3_T01.1, GmFAD3_T02.1 and GmFAD3_T03.1 were 11.2, 16.0 and 4.9 %, respectively. The mutation frequencies at FAD3B and FAD3C were then expressed relative to the frequency of the same TALEN pair at FAD3A. Target sites with one or two SNPs decreased mutation frequencies to ~53 or 63 %, respectively, of the activity of the corresponding TALEN pair at FAD3A; target sites with four SNPs decreased mutation frequencies to 14 %; target sites with five SNPs decreased mutation frequencies to 0.041 %; and target sites with >5 SNPs decreased mutation frequencies to undetectable levels. Although these data do not account for the relative positions of the SNPs, they provide evidence for TALEN target site specificity, indicating that target sites with five or more SNPs are unlikely to be recognized and cleaved.
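The normalization to FAD3A can be sketched as below. The FAD3A frequencies are those reported in the text; the off-target frequency of 8.0 % and the function name `relative_frequency` are hypothetical and serve only to show the arithmetic.

```python
# Mutation frequencies (%) at the FAD3A target sites, as reported in the text.
FAD3A_FREQUENCY = {
    "GmFAD3_T01.1": 11.2,
    "GmFAD3_T02.1": 16.0,
    "GmFAD3_T03.1": 4.9,
}


def relative_frequency(talen_pair: str, off_target_percent: float) -> float:
    """Express an off-target mutation frequency as a percentage of the
    frequency that the same TALEN pair achieved at FAD3A."""
    return 100.0 * off_target_percent / FAD3A_FREQUENCY[talen_pair]


# Hypothetical off-target frequency (illustration only, not measured data):
# 8.0 % at a homologous site is 50 % of the 16.0 % seen at FAD3A.
print(relative_frequency("GmFAD3_T02.1", 8.0))  # 50.0
```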

Generating soybean plants with FAD3 mutations

To generate soybean plants with knockout mutations in FAD3 genes, DNA encoding TALEN pair GmFAD3_T02.1 was stably integrated into the soybean genome [14, 23]. Both WT and fad2-1a fad2-1b mutant soybean lines were transformed; from four independent transformations, a total of 72 events were generated (Table 1). To detect TALEN-induced mutations, the FAD3A gene was amplified and digested with T7 endonuclease I. We observed that 16 of the 72 events had cleavage products consistent with mutations within the GmFAD3_T02.1 target sequence. Cloning and sequencing of FAD3A amplicons revealed that all 16 plants harbored short deletions within the TALEN spacer sequence, ranging from 4 to 135 bp. Together, these results confirm the successful mutagenesis of FAD3A within T0 soybean plants, with a mutagenesis frequency of ~22 %.

Table 1 Summary of FAD3A mutation frequencies within T0 soybean plants

To confirm that TALEN-induced mutations can be stably transmitted to subsequent generations, candidate T1 plants derived from experiment Gm183 were screened for mutations within FAD3A by PCR amplification and sequencing of clones. From three different T0 events (Gm183-4, Gm183-5 and Gm183-6), we identified T1 plants harboring heterozygous or homozygous mutations within FAD3A, indicating that mutations were stably transmitted to the next generation (Table 2). Further, we assessed T1 plants by PCR for the presence of transgene sequence. Of the 25 T1 plants assayed, 20 were positive for transgene sequence and five were negative (i.e., null segregants for the TALEN transgene). Importantly, two of the five transgene-free T1 plants harbored mutations within FAD3A. These two plants were self-pollinated to produce homozygous-mutant, transgene-free fad2-1a fad2-1b fad3a soybean plants. Notably, we also identified a single-gene fad3a knockout T1 plant from experiment Gm184 (identified as Gm184-3-20), which contains a homozygous −4 bp deletion within FAD3A. We failed, however, to identify plants with combinations of FAD3A and FAD3B mutations, indicating that the frequency of mutagenesis at FAD3B was <1.4 % (i.e., less than 1 out of 72 events).
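The two event-based frequencies quoted in this section follow directly from the counts given in the text: 16 mutant events out of 72 for FAD3A, and fewer than 1 out of 72 for FAD3B. A minimal sketch of that arithmetic (the helper name `percent` is an illustrative choice):

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


TOTAL_T0_EVENTS = 72      # events generated from four transformations
FAD3A_MUTANT_EVENTS = 16  # events with T7 endonuclease I cleavage products

# Observed FAD3A mutagenesis frequency among T0 events.
print(f"FAD3A: {percent(FAD3A_MUTANT_EVENTS, TOTAL_T0_EVENTS):.1f} %")  # 22.2 %

# No FAD3B mutant was recovered, so the frequency is below one event in 72.
print(f"FAD3B: < {percent(1, TOTAL_T0_EVENTS):.2f} %")  # < 1.39 %
```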

Table 2 Genotype of T1 plants from candidate T0 events harboring mutations within FAD3A

Oil from fad2-1a fad2-1b fad3a homozygous mutant soybean seeds contains high oleic, low linoleic and low linolenic acid

Next, we assessed the oil profile within seed from fad3a and fad2-1a fad2-1b fad3a homozygous mutant soybean lines (Fig. 3). Seed from T1 homozygous mutant lines was collected and assessed for oil composition by gas chromatographic analysis of fatty acid methyl esters (GC FAME analysis; Additional file 1). In oil from fad3a plants, we observed significant changes in linolenic, linoleic, oleic and stearic acid levels, relative to oil from WT plants: linolenic acid decreased from 8.2 ± 0.4 to 3.9 ± 0.3 %, linoleic acid increased from 51.1 ± 0.2 to 61.9 ± 1.2 %, oleic acid decreased from 23.2 ± 0.8 to 17.9 ± 1.6 % and stearic acid decreased from 4 ± 0.01 to 3.2 ± 0.1 %.

Fig. 3 Fatty acid profile from fad2-1a fad2-1b fad3a soybean plants. Oil from T2 seed from four different T1 fad2-1a fad2-1b fad3a mutant lines was analyzed. The genotypes for the fad2-1a fad2-1b fad3a plant lines at the fad3a TALEN target site were −7 bp/−7 bp (Gm183-4-3), −43 bp/−43 bp (Gm183-5-4), −43 bp/−43 bp (Gm183-5-5), and −43 bp/−43 bp (Gm183-5-9). The genotype for the fad3a plant line was −4 bp/−4 bp (Gm184-3-20). Error bars represent standard deviation of the oil levels within individual seeds, specifically, five seeds for Gm183-4-3, five seeds for Gm183-5-4, five seeds for Gm183-5-5, five seeds for Gm183-5-9, five seeds for Gm184-3-20, four seeds for WT, and 20 seeds for fad2-1a fad2-1b

We observed significant changes in fatty acid levels within seed oil from fad2-1a fad2-1b fad3a soybean plants, when compared to fad2-1a fad2-1b soybean plants. The average linolenic acid level within oil from fad2-1a fad2-1b fad3a plants was 2.5 ± 0.4 %, significantly lower than oil from fad2-1a fad2-1b soybean plants (4.7 ± 0.1 %). Linoleic acid levels decreased from 5.1 ± 0.7 % in fad2-1a fad2-1b lines to 2.7 ± 0.9 % in fad2-1a fad2-1b fad3a lines, and oleic acid levels increased from 77.5 ± 0.8 % in fad2-1a fad2-1b lines to 82.2 ± 1.6 % in fad2-1a fad2-1b fad3a lines. Together, these results indicate that stacking mutations within FAD2-1 and FAD3A genes decreases linolenic and linoleic acid levels to below 3 %, and increases oleic acid levels to over 80 %.
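Values of the form "mean ± standard deviation" can be computed from per-seed measurements as sketched below. The per-seed values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(levels: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of per-seed levels in percent.

    Assumption: sample standard deviation (n - 1 denominator). Use
    statistics.pstdev instead if the population estimator is intended.
    """
    if len(levels) < 2:
        raise ValueError("at least two seeds are needed for a standard deviation")
    return mean(levels), stdev(levels)


# Hypothetical linolenic acid levels (%) for five seeds of one line
# (illustration only, not data from this study).
linolenic_per_seed = [2.1, 2.4, 2.5, 2.6, 2.9]

average, spread = summarize(linolenic_per_seed)
print(f"{average:.1f} ± {spread:.1f} %")
```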