We collected cfDNA sequencing data from 198 twin pregnancies. We applied a multinomial logistic regression that combines the normalized frequencies of the X and Y chromosomes with the fetal fraction estimate to predict fetal sex in twins. Table 1 presents the statistical parameters of the NIPT sequencing data, namely the total read counts and the normalized frequencies of the sex chromosomes, for all samples. A comparison of the overall fetal fraction distribution in singleton and twin pregnancies is presented in Fig. 1. In the twin dataset, fetal fraction ranges from 5.4% to 23.5%. The distribution of fetal fraction in twins is centered at higher values, with an average of 12.1% compared with 9.6% in singleton pregnancies (singleton pregnancies: 9.64 ± 3.52%; twin pregnancies: 12.08 ± 3.50%; Wilcoxon rank-sum test p < 2.2 × 10⁻¹⁶).

Table 1 Statistical parameters of NIPT sequencing data

Fig. 1 Distribution of fetal fraction. The histograms show the overall distribution of the fetal fraction estimates in 21,912 singleton pregnancies (upper panel) and 198 twin pregnancies (lower panel). The fetal fraction dataset used to plot the normal distribution in singleton pregnancies is the same as that presented in Brison et al.19

A one-step regression was first applied to classify fetal sex as female–female (FF), female–male (FM), or male–male (MM). DCDA twin pairs can be either monozygotic or dizygotic; dizygotic pairs have three possible sex combinations, i.e., FF, FM, or MM. In contrast, MCDA/MCMA twins are a subset of monozygotic twin pregnancies and can therefore only be FF or MM. This analysis was first performed in DCDA and MCDA/MCMA samples separately (Fig. 2). In MCDA/MCMA twins, the algorithm shows 100% sensitivity and 100% specificity for FF and MM discrimination (95% CI [0.93–1] and 95% CI [0.90–1]) (Fig. 2a). In DCDA twins, however, sensitivity and specificity are lower because of the possibility of a mixed-sex pair (FM): 90% sensitivity (95% CI [0.74–0.96]) and 94% specificity (95% CI [0.87–0.97]) for FF; 83% sensitivity (95% CI [0.70–0.91]) and 85% specificity (95% CI [0.74–0.92]) for FM; and 81% sensitivity (95% CI [0.66–0.91]) and 96% specificity (95% CI [0.89–0.99]) for MM (Fig. 2b).
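As an illustration only, per-class sensitivity and specificity of the kind reported above can be computed one-versus-rest from true and predicted labels. The sketch below assumes a Wilson score interval for the 95% CI; the interval method used in the study is not stated in this section. The function names and the toy labels are hypothetical and are not study data.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if total == 0:
        return (float("nan"), float("nan"))
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def one_vs_rest(y_true, y_pred, label):
    """Sensitivity and specificity for one class against all other classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    tn = sum(1 for t, p in pairs if t != label and p != label)
    fp = sum(1 for t, p in pairs if t != label and p == label)

    def rate(k, n):
        return k / n if n else float("nan")

    return {
        "sensitivity": rate(tp, tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": rate(tn, tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Toy labels for demonstration only (not study data).
toy_true = ["FF", "FF", "FM", "FM", "MM", "MM"]
toy_pred = ["FF", "FF", "FM", "MM", "MM", "FM"]
for sex in ("FF", "FM", "MM"):
    print(sex, one_vs_rest(toy_true, toy_pred, sex))
```

Treating each sex category in turn as the positive class is what allows a three-class prediction to be summarized with the two-class notions of sensitivity and specificity.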

Fig. 2 Multinomial logistic regression model to predict fetal sex in twin pregnancies. a, b A one-step regression was applied in MCDA/MCMA (a) and DCDA (b) samples separately. c A second-step regression was applied in DCDA samples classified as non-FF to predict whether the twin sex is FM or MM. d The same strategy was used to analyze DCDA and MCDA/MCMA samples together. The sensitivity and specificity were calculated based on the prediction results. DCDA dichorionic diamniotic, MCDA monochorionic diamniotic, MCMA monochorionic monoamniotic, FF female–female, FM female–male, MM male–male, CI confidence interval

To improve the accuracy of our test, we applied a second-step regression in the DCDA samples. Because discrimination of fetal sex relies mostly on the presence or absence of the Y chromosome, it is possible to predict with high accuracy whether or not one of the twins in a dizygotic pair is male. Hence, when the Y chromosome is present, the twins can be assumed to be non-FF. Based on this assumption, if a sample was classified as non-FF, a second-step regression was used to predict whether the twin sex is FM or MM. Using this strategy, we were able to discriminate fetal sex with 98% sensitivity (95% CI [0.89–1]) and 95% specificity (95% CI [0.87–0.98]) in the FM group, and with 92% sensitivity (95% CI [0.79–0.97]) and 99% specificity (95% CI [0.93–1]) in the MM group (Fig. 2c). Four discordant results occurred: one FM case was predicted to be MM, and three MM cases were predicted to be FM. Of note, incorporating the fetal fraction estimate into the regression model improved the accuracy of fetal sex determination (Supplementary Data 1). By plotting the normalized frequencies of Y chromosome reads as a function of fetal fraction in the three groups, we can extrapolate the minimum fetal fraction required to discriminate twin sex. Based on the intersections of the lines for the three groups, the threshold for discrimination of FM and MM is a fetal fraction of 2.08%, and for MM and FF a fetal fraction of 0.76% (Fig. 3a). These are theoretical minima. However, when the variation in the normalized Y chromosome reads is taken into account, one can assume that sex determination is accurate for all pregnancies when the fetal fraction is above 10% and will be accurate in over 50% of the cases when it is above 6%. We also asked whether the fetal fraction estimate would differ among twin pregnancies with only female fetuses, only male fetuses, or male–female fetuses. As expected, no significant difference was observed between these groups (Fig. 3b).
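The thresholds above are read off where the lines of two groups cross. As a rough illustration, assuming each group is summarized by an ordinary least-squares line of normalized Y chromosome read frequency against fetal fraction (the fitting procedure behind Fig. 3a is not specified in this section), the crossing point can be computed as follows. The function names and the toy values are hypothetical and are not study data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def crossing_point(line_a, line_b):
    """x-value at which two (intercept, slope) lines intersect."""
    (a0, a1), (b0, b1) = line_a, line_b
    if a1 == b1:
        raise ValueError("parallel lines do not intersect")
    # Solve a0 + a1 * x = b0 + b1 * x for x.
    return (b0 - a0) / (a1 - b1)


# Toy values in arbitrary units, for demonstration only (not study data).
fetal_fraction = [1.0, 2.0, 3.0]
y_freq_group_a = [2.0, 4.0, 6.0]  # steeper line
y_freq_group_b = [2.0, 3.0, 4.0]  # shallower line

line_a = fit_line(fetal_fraction, y_freq_group_a)
line_b = fit_line(fetal_fraction, y_freq_group_b)
print("lines cross at fetal fraction =", crossing_point(line_a, line_b))
```

Below the crossing point the two group averages cannot be told apart even in principle, which is why the text treats these values as theoretical minima before sample-to-sample variation is considered.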

Fig. 3 Fetal fraction correlation in twin pregnancies. a Correlation of fetal fraction estimation with the normalized frequencies of Y chromosome reads in twin pregnancies. Full lines and dashed lines represent the average and ±3 standard deviations (s.d.) of fetal fraction estimation, respectively. b Comparison of fetal fraction among the three different sex groups in DCDA twin pregnancies. FF female–female, FM female–male, MM male–male

Figure 4 shows the performance of our analysis in all twin pregnancies. All plots show a negative correlation between the normalized frequencies of the X and Y chromosomes, and the FF group forms clusters clearly distinct from the others. This supports the use of a second-step regression analysis. The fetal fraction estimate of each sample is also shown in the plots, with icon size representing the fetal fraction. We also tested whether the accuracy of our model would be affected in cases where chorionicity information for the twin pregnancy cannot be obtained before NIPT analysis. When all 198 samples were analyzed together, the sensitivity and specificity were slightly lower compared with the DCDA samples evaluated separately: 93% sensitivity (95% CI [0.83–0.98]) and 98% specificity (95% CI [0.94–1]) in the FM group, and 96% sensitivity (95% CI [0.89–0.99]) and 97% specificity (95% CI [0.93–1]) in the MM group (Fig. 2d). Thus, when chorionicity is not known, six twin pregnancies were misclassified: three FM cases were predicted to be MM, and three MM cases were predicted to be FM. The 3D plot helps visualize the correlation of the three parameters incorporated in the regression model among the different sex categories (Fig. 5; a 3D rotation animation is presented in Supplementary Movie 1).

Fig. 4 Performance of the multinomial logistic regression model to predict fetal sex in twin pregnancies. The fetal fraction estimate of each sample is presented as a percentage based on the SeqFF method. Icon size represents the fetal fraction estimate (10%, 15%, and 20%), with the DNA samples scaled accordingly. a One-step regression analysis was used to discriminate fetal sex in MCDA/MCMA twin pregnancies. b A two-step regression analysis was applied in DCDA samples separately. DCDA dichorionic diamniotic; MCDA monochorionic diamniotic; MCMA monochorionic monoamniotic; FF female–female; FM female–male; MM male–male