Identification of non-synonymous mutations in pullulanase

Sorghum bicolor PULLULANASE (SbPUL, Sb06g001540) is annotated in the v1.0 proteome as a locus spanning ~14.6 kb, exclusive of 5′ and 3′ regulatory regions22. Pullulanase exists as a single copy in the sorghum genome, and no other starch metabolic genes were annotated at loci within 1 Mb of the pullulanase locus. We sequenced a 1.3-kb fragment in the 5′ genic region of 149 globally distributed accessions, spanning modern breeding programme stock to landraces from traditional cropping systems. In silico analysis for missense mutations revealed polymorphisms leading to two co-occurring non-conservative amino-acid changes that depart from the peptide predicted by the Sb06g001540.1 gene model (www.phytozome.com)22. The polymorphisms encode G32R and D105A amino-acid changes, both of which reside in the N-terminal domain (Fig. 1). Furthermore, we identified a 24-bp deletion that occurs in all haplotypes carrying Arg and Ala residues at positions 32 and 105, respectively. Together, these polymorphisms define the SbPUL-RA allele type. The SbPUL-GD allele type, in contrast, does not possess the 24-bp deletion and encodes Gly and Asp residues at positions 32 and 105, respectively.
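The allele definitions above amount to a simple decision rule over three features: the residue at position 32, the residue at position 105, and the presence of the 24-bp deletion. The following is a minimal sketch of that rule, not the pipeline used in the study; the `PulHaplotype` record and the `classify_allele` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulHaplotype:
    """Hypothetical record of the three features that define an allele type."""
    residue_32: str          # one-letter amino-acid code at position 32
    residue_105: str         # one-letter amino-acid code at position 105
    has_24bp_deletion: bool  # True if the 24-bp deletion is present


def classify_allele(h: PulHaplotype) -> str:
    """Apply the allele definitions given in the text.

    SbPUL-RA: Arg32, Ala105 and the 24-bp deletion.
    SbPUL-GD: Gly32, Asp105 and no deletion.
    Any other combination is returned as 'unclassified' (an assumption of
    this sketch; the text describes only the two allele types).
    """
    if h.residue_32 == "R" and h.residue_105 == "A" and h.has_24bp_deletion:
        return "SbPUL-RA"
    if h.residue_32 == "G" and h.residue_105 == "D" and not h.has_24bp_deletion:
        return "SbPUL-GD"
    return "unclassified"


ra_call = classify_allele(PulHaplotype("R", "A", True))
gd_call = classify_allele(PulHaplotype("G", "D", False))
```

Returning an explicit "unclassified" label, rather than forcing every haplotype into one of the two types, keeps unexpected combinations visible for manual inspection.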

Figure 1: Pile-up diagram of cereal pullulanase peptide sequences of the N-terminal domain region. Black arrows highlight variant amino-acid positions between sorghum pullulanase allele types. Amino-acid tracks are given with pI tracks below; pI track bars are proportional to residue charge (positive is tall and blue; negative is low and black). From top to bottom the sequence tracks are: graphical representation of level of consensus identity, SbPUL-GD region with four amino acids found to differ from Sb06g001540.1 gene model (red underlay labelled 1) and the two polymorphic residues responsible for the GD to RA transition (red underlay labelled 2 and 3), SbPUL-RA, Zpu1 from Zea mays (NP_001104920), HvLD from H. vulgare (after Vester-Christensen et al.25), and OsPUL from Oryza sativa (CAE02111.2).

Comparisons of SbPUL against available sequences

cDNA sequence from SbPUL-GD grain was used to validate the incorporation of these variant codons in Sb06g001540 transcripts and to test the structure of the Sb06g001540.1 gene model. The cDNA sequence confirmed incorporation of the polymorphisms described above. We also noted errors in the splice boundaries annotated for the Sb06g001540.1 5′ region, including the addition of 12 bp to the end of exon 1 and the omission of exon 2 altogether. Correcting the exon structure brought the predicted SbPUL-GD peptide into closer agreement with pullulanases from other cereals. Nevertheless, SbPUL-GD differs significantly from barley, maize and rice at positions 32 and 105. MUSCLE alignments of peptide sequences from the loci most homologous to Sb06g001540.1 in maize, rice and barley highlight how the SbPUL-GD N-terminal sequence deviates from these cereal pullulanases (Fig. 1)23. Polymorphic residues 32 and 105 in sorghum both occur in islands of sequence that are conserved among the cereals, although residues 32 and 33 in barley are swapped in position relative to the homologous positions in other cereals.

More importantly, we found that, compared with SbPUL-GD, the SbPUL-RA form is more similar at the amino-acid level to maize, rice and barley at residue 32/33 and is identical to the other cereal pullulanases at residue 105. To further test the distinct nature of SbPUL-RA allele types, we assembled a gene tree from 42 resequenced lines and found that SbPUL-RA alleles diverge dramatically from SbPUL-GD (Fig. 2). Superimposing our sorghum gene models onto the Hordeum vulgare limit dextrinase (HvLD) three-dimensional structure revealed that residues 32 and 105 are solvent exposed and proximal to one another in the folded peptide (Supplementary Fig. S1)24,25. This implies that the SbPUL-RA allele product resembles other cereal pullulanases in these conserved regions in primary sequence and possibly also in the chemical properties of the peptide within the region. Because these residues are exposed to the extra-molecular environment, the polymorphisms are more likely to have inter-molecular or functional significance. It has been proposed that the N-terminal domain, although poorly defined functionally, forms one-half of a pore leading to the catalytic region in HvLD. In vivo support for this function, however, is lacking at present in cereals25. Identification of the variant SbPUL-GD allele type will allow further biochemical studies of sorghum pullulanase and may shed light on the function of the N-terminal domain.

Figure 2: Gene tree displaying the well-defined clade of SbPUL-RA allele types. Gene tree of SbPUL sequences spanning the genomic sequence from the predicted start and stop of Sb06g001540.1. Insert windows in the upper right depict the polymorphisms of interest in exon 3 and exon 5.

SbPUL-RA confers increased digestibility in isogenic lines

To test whether SbPUL-GD and SbPUL-RA allele types differ in their PA and digestibility in a genetically homogeneous background, we generated an F6 near-isogenic line (NIL) by single-seed descent, selecting seed from a heterozygous individual in each generation. Grain obtained from segregating genotypes was tested through four generations (F3, F4, F5 and F6) for PA using an assay that measures α 1→6 bond cleavage between maltotriose subunits of the polysaccharide substrate. Overall, we found that SbPUL-RA grain possessed significantly higher PA than SbPUL-GD grain (analysis of variance (ANOVA): df=31; F_RA−GD=17.62, P_RA−GD=0.0002; see Fig. 3a and Supplementary Table S1). In fact, the PA of SbPUL-RA was 67% greater than that of SbPUL-GD. Heterozygous grain, on the other hand, was statistically indistinguishable from both SbPUL-GD and SbPUL-RA (ANOVA: df=31; F_RA−HET=2.62, P_RA−HET=0.115; F_GD−HET=2.77, P_GD−HET=0.106; see Fig. 3a and Supplementary Table S1). Heterozygote PA values were between those of the homozygotes. The lack of a significant difference between heterozygotes and homozygotes is probably due to the greater variability in heterozygous lines. Taken together, the polymorphisms described above, the lack of any nearby annotated gene that may have an effect, and the pullulanase enzyme assay all strongly support our hypothesis that SbPUL allele type underlies the effects described here.
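For readers unfamiliar with how an ANOVA F statistic is formed, the sketch below computes a one-way ANOVA F from groups of measurements using only the Python standard library. It is an illustration of the general technique, not the authors' analysis: the function name `one_way_anova_f` is hypothetical, the numbers are toy placeholders rather than PA measurements, and the pairwise contrasts reported above may have been computed within a larger model.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per group.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy placeholder values only; these are NOT data from the study.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
```

A large F indicates that the differences among group means are large relative to the variation within groups. A P value would then be obtained from the F distribution with the two degrees of freedom returned here.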

Figure 3: Comparison of PA and ivD of the NIL and diverse genotype sets. (a) PA (n=8 of each genotype with two technical replicates) and ivD (n=3 of each genotype with two technical replicates) of NILs homozygous for either the SbPUL-GD or SbPUL-RA allele, or heterozygous at the SbPUL locus. (b) PA and ivD of dNIRS genotype set (n=36). For both figure panels y-axis units are milliunits of α 1→6 glucosidic bond cleavage activity and glucose released (unit total starch)⁻¹ for PA and ivD, respectively. Dark grey and light grey bars are PA and ivD values, respectively. Error bars represent 1 s.d. from the mean.

Using the F6 NILs, we then tested whether the pullulanase allele types differ in their digestibility. This was done using an ivD assay designed as a proxy for the monogastric digestive system, using porcine digestive enzymes26. Overall, we found that the SbPUL-RA grain showed significantly higher ivD than both the SbPUL-GD and heterozygote grain (ANOVA: df=5; F_RA−GD=14.68, P_RA−GD=0.018; F_RA−HET=11.13, P_RA−HET=0.028; F_GD−HET=2.72, P_GD−HET=0.174; see Fig. 3a and Supplementary Table S2). On average, the ivD of the SbPUL-RA grain was 41% higher than that of the SbPUL-GD grain and 21% higher than that of the SbPUL heterozygote grain.

The SbPUL-RA phenotype is expressed irrespective of genotype

To ensure the reliability of the observed differences in PA and ivD between the two pullulanase allele types isolated in the NIL, we replicated the PA and ivD analyses using a panel of 36 accessions. These were selected for the diversity of their grain near-infrared spectra (dNIRS) and represent elite hybrid parents and landraces used in the Queensland sorghum breeding programme. We found that, on average, the genotypes carrying the SbPUL-RA allele type had significantly higher ivD than the genotypes carrying the SbPUL-GD allele type (generalized linear mixed model (GLMM) fitted with a binomial distribution and genotypes as random effects: PUL allele~ivD; df=34; estimate=0.35; s.e.=0.11; t=3.04; P=0.0045; see Fig. 3b and Supplementary Table S3). The average ivD of the genotypes carrying the SbPUL-RA allele type was 28% higher than that of the genotypes carrying the SbPUL-GD allele type. In contrast to the NIL, genotypes carrying the SbPUL-RA allele type did not have significantly higher PA than the genotypes carrying the SbPUL-GD allele type (GLMM fitted with a binomial distribution and genotypes as random effects: PUL allele~PA; df=34; estimate=0.24; s.e.=0.13; t=1.83; P=0.075; see Fig. 3b and Supplementary Table S3). The average PA of the genotypes carrying the SbPUL-RA allele type was, however, 82% higher than that of the genotypes carrying the SbPUL-GD allele type. Furthermore, PA was significantly positively correlated with ivD (LMM with genotypes as random effects: PA~ivD; df=34; estimate=0.156; s.e.=0.047; F=10.76; P=0.0024). This suggests that, in comparison with ivD, PA was more sensitive to the wide genetic and developmental variability of the accessions included in the dNIRS set. Overall, our results demonstrate that sorghum accessions homozygous for the SbPUL-RA allele showed higher ivD than sorghum accessions carrying the SbPUL-GD allele, regardless of the genetic background.

SbPUL-RA occurs at a reduced frequency

Given the apparent advantage that the SbPUL-RA allele type confers on digestibility, it seems intuitive that such a beneficial allele type would be present in the majority of cultivated sorghum. Surprisingly, we did not find this to be the case. In addition to sequencing the 149 lines mentioned above, we genotyped 70 lines using primers amplifying across the associated 24-bp deletion in SbPUL-RA, for a total of 219 globally distributed genotypes from modern breeding programmes and landraces. The SbPUL-RA allele type occurs at a low frequency, more than five times lower than that of SbPUL-GD (SbPUL-RA=15.07%, SbPUL-GD=82.65%, het=2.28%; Supplementary Table S4). This highlights the potential for increasing the frequency of SbPUL-RA as a way to improve the nutritional value of sorghum.
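The frequency arithmetic can be checked with a short calculation. The sketch below back-calculates genotype counts from the reported percentages and the total of 219 lines; the counts are inferred by rounding and are an assumption of this sketch, as the actual counts are given only in Supplementary Table S4. The variable and function names are illustrative.

```python
SEQUENCED_LINES = 149   # lines sequenced, as stated in the text
GENOTYPED_LINES = 70    # lines genotyped across the 24-bp deletion
TOTAL_LINES = SEQUENCED_LINES + GENOTYPED_LINES

# Percentages as reported in the text.
REPORTED_PCT = {"SbPUL-RA": 15.07, "SbPUL-GD": 82.65, "het": 2.28}


def implied_counts(total, pct):
    """Back-calculate whole-number counts from percentages (an inference,
    not data taken from the supplementary table)."""
    return {name: round(total * value / 100) for name, value in pct.items()}


counts = implied_counts(TOTAL_LINES, REPORTED_PCT)

# Ratio of SbPUL-GD to SbPUL-RA frequency, for comparison with the
# "five times" statement in the text.
gd_to_ra_ratio = REPORTED_PCT["SbPUL-GD"] / REPORTED_PCT["SbPUL-RA"]

# Recompute percentages from the inferred counts as a round-trip check.
recomputed_pct = {
    name: round(100 * n / TOTAL_LINES, 2) for name, n in counts.items()
}
```

If the inferred counts sum to the stated total and reproduce the reported percentages after rounding, the figures in the paragraph are internally consistent.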

No deleterious effects are associated with SbPUL-RA