PGT variant validation

Sequencing depth was comparable between amplified trophectoderm-biopsy DNA from embryos and the parents’ genomic DNA (mean depth of 48.2× versus 46.1×). Embryo samples were equivalent to the couples’ genomic DNA samples for raw and clean reads, bases aligned and transition-to-transversion ratios (2.071 and 2.081; Supplementary Table 1 and Fig. 1a). Genome coverage for embryos and couples was comparable at sequencing depths of 4× and 10×. At 20×, however, genome coverage was lower for biopsied embryos, at 87.5% compared with 96.4% for genomic DNA (Supplementary Figs. 1b, 4a,b). Therefore, with the exception of the failsafe filter, each variant filter set used a depth threshold of >10× coverage.
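The coverage comparison above can be illustrated with a minimal sketch of breadth of coverage at several depth cut-offs. This is not the pipeline used in the study: the function name `breadth_of_coverage` and the per-position depth list are hypothetical, and whether a position counts as covered at a depth equal to the threshold (as assumed here) is a choice made only for illustration.

```python
def breadth_of_coverage(depths, thresholds=(4, 10, 20)):
    """Return, for each threshold, the fraction of positions whose
    read depth is at least that threshold.

    `depths` is a hypothetical list of per-position read depths.
    """
    n = len(depths)
    if n == 0:
        raise ValueError("no positions supplied")
    return {t: sum(d >= t for d in depths) / n for t in thresholds}


# Toy example with four positions (illustrative values only).
toy_depths = [0, 5, 12, 25]
print(breadth_of_coverage(toy_depths))
```

A drop in the fraction at the 20× cut-off but not at 4× or 10×, as reported for the embryo samples, would be consistent with choosing a filter threshold near 10×.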

Assembly and mapping for the SNP and indel calls were highly concordant between embryos and couples (Supplementary Fig. 1c–f), except for novel SNPs, which averaged 85,527 (standard deviation [SD] 29,576.6) variants in embryos and 21,663 (SD 1102.4) variants in couples. This was reflected in the higher number of LoH regions in embryos (5460, SD 1609 versus 3733, SD 87), which presumably indicates regions of allele dropout.

De novo mutations

As expected for the genomic DNA samples of each couple’s male and female partners, non-homozygote VAFs showed a normal distribution centred at 0.5 (indicating 50% of reads per base, Supplementary Fig. 2b). The embryos’ heterozygote VAF distribution ranged from 0.08 to 0.34 with an average peak at 0.26 and maximum at 0.12 (Supplementary Fig. 2a). This low embryo VAF is believed to represent false-positive heterozygote calls from either base misincorporation or read misalignment22. For this reason, the de novo filter included a false-positive filtering gate to remove de novo SNP variants with a VAF < 0.35, the rationale being that the failsafe filter will shortlist potentially dangerous or clinically actionable variants for individual curation. Variants involving deletions >1 bp had a higher VAF than those involving a base change, although we did not alter the filtering on this basis as the upper limit was approximately consistent.
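The VAF gate described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the filter implementation used in the study: the `Call` record and its fields are hypothetical, VAF is assumed to be alternate-allele reads divided by the sum of reference and alternate reads, and the gate is assumed to apply only to de novo SNPs, leaving transmitted variants and indels untouched.

```python
from dataclasses import dataclass

MIN_DE_NOVO_VAF = 0.35  # gate value quoted in the text


@dataclass
class Call:
    """Hypothetical variant call record."""
    ref_reads: int
    alt_reads: int
    is_de_novo: bool
    is_snp: bool


def vaf(call: Call) -> float:
    """Fraction of reads supporting the alternate allele (assumed definition)."""
    total = call.ref_reads + call.alt_reads
    return call.alt_reads / total if total else 0.0


def passes_vaf_gate(call: Call) -> bool:
    """Remove de novo SNPs below the VAF threshold; pass everything else."""
    if call.is_de_novo and call.is_snp:
        return vaf(call) >= MIN_DE_NOVO_VAF
    return True


low_vaf_de_novo = Call(ref_reads=18, alt_reads=3, is_de_novo=True, is_snp=True)
low_vaf_transmitted = Call(ref_reads=18, alt_reads=3, is_de_novo=False, is_snp=True)
print(passes_vaf_gate(low_vaf_de_novo), passes_vaf_gate(low_vaf_transmitted))
```

Under these assumptions a low-VAF call is discarded only when it is a candidate de novo SNP, while the same read counts for a transmitted variant are retained.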

An additional quality-by-depth (QD) threshold of >12 was added to the non-dbSNP variant subfilters. This QD threshold reduced the number of de novo variants flagged for curation from 285 across all eleven embryos to 57. QD filtering was not applied to the transmitted variants, but when this stringent filter was applied to the non-dbSNP variants, 8/125 unique and pathogenic transmitted variants were removed from reporting.
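A minimal sketch of the QD gate follows, assuming each variant record already carries a QD annotation from the variant caller. The function `keep_variant` and the record fields `in_dbsnp` and `qd` are hypothetical names introduced for illustration, and the toy records are not data from the study.

```python
QD_MIN = 12.0  # threshold quoted in the text (QD must exceed this value)


def keep_variant(in_dbsnp: bool, qd: float) -> bool:
    """Apply the QD threshold only to variants not catalogued in dbSNP."""
    if in_dbsnp:
        return True  # dbSNP sub-filter: no QD gate in this sketch
    return qd > QD_MIN


# Hypothetical candidate records for illustration only.
candidates = [
    {"id": "var_a", "in_dbsnp": False, "qd": 20.9},
    {"id": "var_b", "in_dbsnp": False, "qd": 7.4},
    {"id": "var_c", "in_dbsnp": True, "qd": 5.0},
]
kept = [c["id"] for c in candidates if keep_variant(c["in_dbsnp"], c["qd"])]
print(kept)
```

For scale, the reduction reported above (285 to 57 flagged de novo variants) corresponds to retaining 57/285 = 20% of the candidates.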

Variant filters were therefore arranged, for each mode of inheritance, into two parallel sub-filter sets through which all variants were assessed: one sub-filter of each filter set annotated variants catalogued in dbSNP, and a second handled variants not catalogued to date, for which pathogenicity prediction was used (Fig. 1a–c).
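The two parallel sub-filters can be pictured as a simple routing step. The sketch below is only an illustration of that arrangement: the dictionary keys `rsid`, `clinical_significance` and `predicted_damaging` are hypothetical, and the actual annotation sources and prediction tools are not specified here.

```python
REPORTABLE = {"pathogenic", "likely_pathogenic"}


def route(variant: dict) -> str:
    """Send a variant to the dbSNP or the non-dbSNP sub-filter."""
    return "dbsnp" if variant.get("rsid") else "non_dbsnp"


def is_flagged(variant: dict) -> bool:
    """Flag a variant using the rule of the sub-filter it was routed to."""
    if route(variant) == "dbsnp":
        # Catalogued variants: rely on an existing clinical classification.
        return variant.get("clinical_significance") in REPORTABLE
    # Uncatalogued variants: rely on an in silico pathogenicity prediction.
    return bool(variant.get("predicted_damaging", False))


catalogued = {"rsid": "rs0000001", "clinical_significance": "pathogenic"}
uncatalogued = {"rsid": None, "predicted_damaging": True}
print(route(catalogued), is_flagged(catalogued))
print(route(uncatalogued), is_flagged(uncatalogued))
```

Every variant passes through exactly one of the two branches, so no variant is skipped when it lacks a dbSNP entry.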

Variant trio-calling

Three of the five couples had undergone PGT for autosomal dominant conditions, one for an autosomal recessive condition and one for an X-linked condition (Table 1). To confirm the embryo PGT results, at least one euploid embryo (i.e. affected, carrier or unaffected) was available from three of the five couples. To determine the concordance between the whole-genome sequencing results and the HumanCytoSNP-12 BeadArray platform used for the couples’ clinical Karyomapping cycles, heterozygote calls (~75,000 variants) were assessed and showed >99.0% concordance with whole-genome sequencing calls. Comparing the pathogenic variants previously diagnosed during monogenic PGT cycles using Karyomapping with those obtained through whole-genome sequencing indicated complete concordance for both couples and embryos (Table 1). One embryo’s PGT variant had a substantially lower than expected VAF (0.143; 3/21 reads), but as this was a transmitted variant it was called by the filter pipeline.
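Genotype concordance between two platforms can be computed as the fraction of shared sites with identical genotypes. The following is a minimal sketch, assuming genotypes are stored as unordered allele pairs keyed by a site identifier; the function name, identifiers and toy genotypes are hypothetical and do not reproduce the study's comparison.

```python
def concordance(array_calls: dict, wgs_calls: dict) -> float:
    """Fraction of sites called on both platforms with the same genotype.

    Genotypes are allele pairs; allele order is ignored.
    """
    shared = array_calls.keys() & wgs_calls.keys()
    if not shared:
        raise ValueError("no sites in common")
    matches = sum(
        sorted(array_calls[site]) == sorted(wgs_calls[site]) for site in shared
    )
    return matches / len(shared)


# Toy heterozygote calls (illustrative only).
array_calls = {"site1": ("A", "G"), "site2": ("C", "T"), "site3": ("A", "A")}
wgs_calls = {"site1": ("G", "A"), "site2": ("C", "C"), "site4": ("T", "T")}
print(concordance(array_calls, wgs_calls))
```

Sites present on only one platform are excluded from the denominator, which is one of several possible conventions.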

Table 1 Couples and embryo numbers by inheritance, disease status and type of variant.

Pathogenic and predicted pathogenic variant detection in embryos

For the recessive filter there was an average of 0.82 transmitted pathogenic variants found in dbSNP per embryo (build 151, ranging between 1 and 2 stars for ClinVar review status, 0 stars representing no assertion criteria or minimal evidence, up to 4 stars for clinical practice guideline). This compares with an average of 1.27 non-inherited variants per embryo that were predicted pathogenic (Fig. 2, excluding variants for which the couples had originally sought PGT). In one of the couples, both partners were heterozygote carriers of the CFTR ΔF508 mutation, which resulted in a heterozygote genotype in at least one embryo.

Figure 2 Bar graphs of the filter system for determining the clinically relevant variants proposed for embryo selection for each mode of inheritance: (A) filter sets for determining clinically relevant variants classified as either likely pathogenic or pathogenic and (B) filter sets for variants not yet classified but potentially damaging or disease causative. Filters in each row are successively added to the total number of variants remaining.

For the dominant filters, 1.27 pathogenic variants per embryo were in dbSNP, compared with a mean of 0.45 non-dbSNP predicted pathogenic variants. To detect transmitted pathogenic or predicted pathogenic variants in regions of allele dropout and/or low coverage in the amplified embryo DNA relative to the parental genomic DNA, LoH (>95% and 100 variants) was used for variants that had fewer than 10 reads. A mean total of 2.3 (SD 1.2) pathogenic or predicted pathogenic variants across all the filters were noted as expected but missing from the embryo sequencing because of the low coverage threshold or LoH. Pathogenic variants in low-coverage regions were phased using the nearest SNPs flanking the missing regions to determine carrier status. A mean of 4.5 (SD 3.7) likely pathogenic or pathogenic variants were found in embryos, and a mean of 5.5 (SD 3.4) variants were deemed potentially pathogenic and required haplotype curation via LoH to account for dropout of potentially inherited but missing pathogenic variants.
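One way to read the LoH criterion "(>95% and 100 variants)" is as a sliding window of 100 consecutive variants in which more than 95% of calls are homozygous. That reading is an assumption made for this sketch, not a confirmed description of the study's method; the function name and inputs are hypothetical.

```python
def loh_windows(is_hom, window=100, min_frac=0.95):
    """Return start indices of windows of `window` consecutive variants
    in which the homozygous fraction exceeds `min_frac`.

    `is_hom` is a list of booleans, one per variant in genomic order.
    """
    starts = []
    if len(is_hom) < window:
        return starts
    count = sum(is_hom[:window])  # homozygous calls in the first window
    for start in range(len(is_hom) - window + 1):
        if start > 0:
            # Slide by one variant: add the entering call, drop the leaving one.
            count += is_hom[start + window - 1] - is_hom[start - 1]
        if count / window > min_frac:
            starts.append(start)
    return starts


# 96 homozygous calls followed by 4 heterozygous calls: 96% > 95%.
print(loh_windows([True] * 96 + [False] * 4))
```

Overlapping flagged windows would still need to be merged into regions, a step omitted here for brevity.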

To prevent filtering of true-positive de novo mutations, the failsafe filter container was used to capture clinically relevant variants for curation. After elimination of PGT variants, 17 variants with a review status of 3 stars were detected in the 11 embryos; none were in clinically actionable essential or developmental delay genes, and they were removed following QD filtering. Review status classification revealed that only the failsafe filters had missing calls, with a mean of 2.36 (SD 3.86); none of the variants captured by the failsafe filter resulted in compound heterozygotes derived from transmitted variants. There were no ClinVar review status 1-star (conflicting interpretations) variants found in any of the embryo samples. Similarly, there were no compound heterozygotes, no homozygous autosomal recessive or X-linked (in females) variants, and no likely pathogenic or pathogenic variants in American College of Medical Genetics incidental findings genes in embryos or parental genomes. There were 109 unclassified candidate pathogenic de novo mutations across the 11 embryos, with nine variants recurring across multiple embryos, all but two of which occurred in more than one family. There were 10 candidate de novo autosomal dominant variants in four embryos with a VAF < 0.4 and only one with a VAF > 0.5, indicating a high likelihood of false-positive calls. Adding the minimum QD threshold to the unclassified filters (excluding variants with QD < 12) reduced the candidate false-positive unclassified variant calls to one de novo mutation at the ABL1 locus (rs121913459, VAF 0.63, QD = 20.9) in a single embryo56.

Tandem repeat disease loci analysis

Expansion Hunter was used to assess the tandem repeat number at 17 known disease-causing loci; no parental samples indicated pathogenic repeat numbers. In embryo samples, most of the loci tested provided at least one concordant call in terms of exact transmission. At three loci, both alleles were discordant: FMR1, ATXN1 and ATXN3.
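Transmission exactness for a repeat locus can be scored by asking how many of the embryo's two alleles match a parental repeat count exactly, with one allele drawn from each parent. The sketch below assumes a diploid autosomal locus (X-linked loci in male embryos would need separate handling); the function name and repeat counts are hypothetical.

```python
def concordant_alleles(embryo, mother, father) -> int:
    """Number of embryo alleles (0, 1 or 2) explained by exact transmission
    of one repeat count from the mother and one from the father."""
    a, b = embryo
    best = 0
    # Try both assignments of embryo alleles to maternal/paternal origin.
    for from_mother, from_father in ((a, b), (b, a)):
        score = (from_mother in mother) + (from_father in father)
        best = max(best, score)
    return best


mother = (20, 25)   # hypothetical repeat counts
father = (30, 31)
print(concordant_alleles((20, 30), mother, father))  # both alleles match
print(concordant_alleles((20, 33), mother, father))  # one allele matches
print(concordant_alleles((40, 41), mother, father))  # both discordant
```

A score of 0 corresponds to the situation reported at the three loci where both alleles were discordant.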

Copy-number and structural variations

CNVs were assessed by direct transmission and by binning reads in 10 kb windows, with comparison against inheritance and ClinGen dosage sensitivity scores for pathogenicity. CNV calls were higher in the embryos than in parental samples, except for inter-chromosomal structural variants and structural deletions, suggesting a high false-positive rate (Supplementary Fig. 1f and Supplementary Table 1). As anticipated from the Karyomapping results, no pathogenic CNVs were detected (Fig. 3). There was a mean of 2.0 deleterious autosomal recessive structural variations for both couples and embryos, compared with means of 5.21 and 8.05 structural variations for couples and embryos, respectively, for which triplosensitivity was contributing as autosomal recessive.
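Read binning in fixed 10 kb windows can be sketched as below. This is a simplified illustration rather than the CNV caller used in the study: reads are represented only by hypothetical start coordinates on one chromosome, and the log2 ratio against a reference sample (for example a parent) with total-count normalisation is an assumed, commonly used comparison.

```python
import math
from collections import Counter

BIN_SIZE = 10_000  # 10 kb windows, as stated in the text


def bin_counts(read_starts, bin_size=BIN_SIZE) -> Counter:
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // bin_size for pos in read_starts)


def log2_ratios(sample: Counter, reference: Counter) -> dict:
    """Per-window log2 ratio of library-size-normalised read counts.

    Windows with zero reads in either sample are skipped.
    """
    s_total = sum(sample.values())
    r_total = sum(reference.values())
    ratios = {}
    for window, r in reference.items():
        s = sample.get(window, 0)
        if s == 0 or r == 0:
            continue
        ratios[window] = math.log2((s / s_total) / (r / r_total))
    return ratios


embryo = bin_counts([0, 9_999, 10_000, 25_000])
parent = bin_counts([100, 5_000, 12_000, 29_000])
print(embryo)
print(log2_ratios(embryo, parent))
```

Windows whose ratio departs consistently from zero would be candidate gains or losses, subject to the high false-positive rate noted for amplified embryo DNA.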