(A) Sixteen genes were identified as differentially expressed between unmatched hESC and hiPSC lines described in this study (FDR < 0.15 and fold change > 2 or < 0.5; see Methods for details). (B) Definition and number of significantly differentially expressed genes (DEGs) between hESCs and hiPSCs. (C) Dendrogram for all isogenic hESC (blue) and hiPSC (red) lines from Choi et al. based on expression levels of the 16 DEGs identified in A. (D-F) Dendrograms for non-isogenic hESC (blue) and hiPSC (red) lines (Choi et al.) based on DEGs defined in other studies (see Supplementary Fig. 5B). (G-I) Dendrograms for non-isogenic hESC (blue) and hiPSC (red) lines (Phanstiel et al., ref. 10) based on DEGs defined in other studies (see Supplementary Fig. 5B). (J) Genes were ranked by the sum of -log10(p-value) of differential expression between HUES2 and HUES3 lines (see Methods for details), and the frequency (top) and placement (bottom) of Phanstiel et al.'s DEGs (ref. 10; red circles) within the ranking were determined. (K) Left panel: distribution of Dunn-index-based scores for random gene sets. The score measures how well a gene set separates our isogenic samples by genetic background: larger values indicate better separation, and zero indicates that the samples are not separated by genetic background. Each of the 10,000 random gene sets was size- and expression-matched to Phanstiel et al.'s DEGs (ref. 10). The red vertical line indicates the value for Phanstiel et al.'s DEGs, which separate the samples significantly better than a random set of genes (p-value = 0.0236). Right panel: distribution of expression levels of the random gene sets, computed as the sum of log(TPM+1). Gene sets were stratified according to whether they separate our isogenic samples by genetic background (red) or not (green), showing that the separation is not affected by expression levels. The dotted line indicates Phanstiel et al.'s DEGs.
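The comparison in panel K can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from this study: it uses the classical Dunn index (smallest between-group distance divided by largest within-group distance) rather than the Dunn-index-based score defined in the Methods, it draws random gene sets matched by size only (expression matching is omitted), and it takes the empirical p-value as the fraction of random sets scoring at least as high as the observed set. All function names are hypothetical.

```python
import numpy as np


def dunn_index(X, labels):
    """Classical Dunn index (an assumption; the study uses a Dunn-index-based score).

    X: samples x genes matrix, e.g. log(TPM+1); labels: genetic background per sample.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between samples (rows of X).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    within = dist[same & off_diag]   # distances inside a background
    between = dist[~same]            # distances across backgrounds
    if within.size == 0 or between.size == 0 or within.max() == 0:
        raise ValueError("need >=2 groups, >=2 samples in a group, nonzero spread")
    return between.min() / within.max()


def random_set_scores(expr, labels, set_size, n_sets, seed=0):
    """Scores of random gene sets, matched by size only (expression matching omitted)."""
    expr = np.asarray(expr, dtype=float)
    rng = np.random.default_rng(seed)
    scores = np.empty(n_sets)
    for i in range(n_sets):
        genes = rng.choice(expr.shape[1], size=set_size, replace=False)
        scores[i] = dunn_index(expr[:, genes], labels)
    return scores


def empirical_p_value(observed, null_scores):
    """Fraction of random gene sets scoring at least as high as the observed set."""
    null_scores = np.asarray(null_scores, dtype=float)
    return float(np.mean(null_scores >= observed))
```

Under these assumptions, a gene set of interest would be scored with `dunn_index` on its columns of the expression matrix and compared with `random_set_scores` through `empirical_p_value`.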