Autosomal SNP analysis

For ease of understanding, we first classified the different Indian Jewish groups present in our combined dataset (Supplementary Table 1). For the autosomal analysis, we merged our data with data from nine different studies11,12,21,22,23,24,25,26,27. The combined data represent Indian Jewish populations from two distinct geographical regions of India (Fig. 1). We renamed the Bene Israel (Mumbai Jewish)11, from western India, as Jewish 1, and the three South Indian Jewish groups11,12,23 as Jewish 2, Jewish 3 and Jewish 4 (Supplementary Table 1 and Supplementary text).

In the present study, we analysed samples from the Cochin Jewish and Mumbai Jewish (Bene Israel) groups and refer to them collectively as Indian Jewish (Fig. 1). To measure the genetic differentiation of the Indian Jewish at the inter-regional, intra-regional and population levels, we first calculated Fst (Fig. 2a and Supplementary Table 2). The population-wise comparison showed that the Indian Jewish share a close affinity with their local South Asian neighbours, except for Indian Jewish 2 and Indian Jewish 4 (sampled from the same Indian Jewish territory) (Fig. 2a and Supplementary Table 2).
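As an illustration of how a pairwise Fst can be computed from allele frequencies, the following is a minimal sketch of Hudson's estimator, combined across SNPs as a ratio of averages. The text does not state which Fst estimator was used, so the choice of estimator, the function name `hudson_fst` and the example frequencies are all assumptions made for illustration and are not data from the study.

```python
import numpy as np

def hudson_fst(p1, p2, n1, n2):
    """Hudson's Fst between two populations, as a ratio of averages over SNPs.

    p1, p2: per-SNP allele frequencies in population 1 and population 2.
    n1, n2: number of sampled chromosomes in each population
            (assumed constant across SNPs in this sketch).
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    # Numerator: squared frequency difference, corrected for sampling noise.
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    # Denominator: probability that two alleles drawn from different
    # populations differ.
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return float(num.sum() / den.sum())

# Hypothetical allele frequencies at four SNPs (illustration only).
pop_a = [0.10, 0.50, 0.80, 0.30]
pop_b = [0.15, 0.45, 0.70, 0.35]
fst = hudson_fst(pop_a, pop_b, n1=200, n2=200)
```

Values near zero indicate little differentiation between the two populations, and a value of 1 corresponds to fixed differences at every SNP. With small samples the sampling correction can push the estimate slightly below zero.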

Figure 2 (a) Mean pairwise Fst comparison of Indian Jewish with other regional populations obtained from the autosomal SNP data. (b) Principal component analysis (PCA) of the combined autosomal SNP data of individuals from Eurasia.

We further used principal component analysis (PCA) to capture the genetic variation of the Indian Jewish along two axes covering the Eurasian landscape (Fig. 2b). Consistent with the Fst analysis, the Indian Jewish were dispersed along the South Asian Indo-European-Dravidian cline. In agreement with the previous study11, the model-based clustering method ADMIXTURE identified three ancestral components among the Indian Jewish populations (Fig. 3 and Supplementary Fig. 1). Supporting the Fst and PCA results, Indian ancestry was overwhelmingly dominant among the Indian Jewish; however, substantial traces of Middle Eastern ancestry (dark blue component) were also evident. The relative proportion of the Middle Eastern component among the Indian Jewish ranged from 3 to 20% (Table 1). Conversely, Middle Eastern ancestry was largely negligible among their neighbouring Indo-European and Dravidian populations. The worldwide spatial distribution of the Middle Eastern (dark blue) component also showed an elevation among the Indian Jewish populations that cannot be explained by an isolation-by-distance scenario, under which one would expect a gradient of Middle Eastern ancestry in India along the West-East and North-South axes (Supplementary Fig. 2). The Indian Jewish populations differ from each other in the amount of Near Eastern ancestry they harbour, as well as in their placement in the PCA plot and in their Fst values (Fig. 2 and Supplementary Table 2). Therefore, regardless of the potential existence of a Near Eastern genetic link for the Indian Jewish, a substantial proportion of their genomic ancestry is shared with Indian populations.

Table 1 The Middle Eastern specific ancestry (%) among Indian populations inferred from ADMIXTURE.

Figure 3 Results of the population-wise unsupervised ‘Structure-like’ ADMIXTURE analysis (K = 7) of world populations, projecting various Jewish populations.

To validate the Middle Eastern admixture in the Indian Jewish populations, we applied three-population (f3) and four-population (f4) tests23,25,28. The outgroup f3 statistic showed significantly higher shared genetic drift with Middle Eastern populations for three Indian Jewish groups (Jewish 1, Jewish 2 and Jewish 4) than for their neighbouring local populations (Supplementary Figs 3 and 4 and Table 2). Jewish 3 showed the lowest affinity to the Middle East among the Indian Jewish groups; nevertheless, its affinity was significantly higher than that of most of its neighbouring Dravidian populations (Table 2). The ANI (Ancestral North Indian) admixture proportion calculated from the f4 ancestry test was consistently higher among the Indian Jewish populations than among their Indian neighbours (Supplementary Table 3). We also found a higher number and greater length of runs of homozygosity (ROH) segments in the Indian Jewish 1 and Indian Jewish 4 groups, whilst Indian Jewish 2 showed the lowest ROH among all the Indian Jewish populations (Supplementary Fig. 5).
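The idea behind the outgroup f3 statistic can be sketched as follows: f3(O; A, B) is the average over SNPs of (o - a)(o - b), where o, a and b are the allele frequencies in the outgroup O and in populations A and B. Larger values indicate more genetic drift shared by A and B since their split from the outgroup. The sketch below is a simplified illustration that omits the finite-sample correction and the standard-error estimation used by dedicated software; the function name and all frequencies are hypothetical and are not taken from the study.

```python
import numpy as np

def outgroup_f3(outgroup, pop_a, pop_b):
    """Uncorrected outgroup f3(O; A, B): mean of (o - a) * (o - b) over SNPs."""
    o = np.asarray(outgroup, dtype=float)
    a = np.asarray(pop_a, dtype=float)
    b = np.asarray(pop_b, dtype=float)
    return float(np.mean((o - a) * (o - b)))

# Hypothetical per-SNP allele frequencies (illustration only).
outgroup = [0.10, 0.20, 0.90]
target = [0.50, 0.60, 0.40]       # population of interest
close_ref = [0.55, 0.65, 0.35]    # reference sharing more drift with target
distant_ref = [0.20, 0.30, 0.80]  # reference sharing less drift with target

shared_close = outgroup_f3(outgroup, target, close_ref)
shared_distant = outgroup_f3(outgroup, target, distant_ref)
```

In this toy example `shared_close` exceeds `shared_distant`, which is the kind of comparison used to rank reference populations by their affinity to a target.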

Table 2 Shared drift (f3) analysis results suggesting Middle Eastern affinity for Indian Jewish populations.

To identify the population structure of the Indian Jewish based on haplotypes and recombination across the genome, we further used ChromoPainter and performed a fineSTRUCTURE analysis29. Consistent with the above analyses, all the Indian Jewish groups received a greater number/length of chunks from local South Asian populations than from their parental Middle Eastern populations (Fig. 4 and Table 3). Nevertheless, the neighbouring local Dravidian populations received a significantly lower number/length of Middle Eastern chunks than the Indian Jewish populations (two-tailed p value <0.0001) (Table 3), which supports the affinity of the Indian Jewish with the Middle Eastern populations. Among the four Jewish groups, the affinity with Middle Eastern ancestry followed the order Jewish 1 > Jewish 2 > Jewish 4 > Jewish 3 (Fig. 4 and Table 3). Notably, we did not find any significant difference in the chunk number/length that the Indian Jewish shared with Middle Eastern Jewish vs. non-Jewish populations (Table 3).

Table 3 The average Chunk Lengths and Chunk Counts donated by different population groups to various Indian Jewish groups.

Figure 4 Comparison of the mean sharing of chunklengths of Indian Jewish vs Indian Dravidian neighbours received from Eurasian populations.

We applied the LD-based Alder method30 to estimate the time of admixture between the Indian Jewish and their neighbouring local Indian populations. We used Yemeni Jewish and Druze populations as Middle Eastern surrogates, and GIH, Paniya and Kurumba as local Indian surrogate populations (Supplementary Table 4). Assuming a generation time of 30 years, the Alder analysis of Indian Jewish 1 yielded ~1100 years as the time of admixture with the GIH population (Table 4). Among the three Kerala Jewish groups (Jewish 2, Jewish 3 and Jewish 4), the time of admixture was oldest for Indian Jewish 4 (1590 years), whereas Indian Jewish 3 showed a time of admixture of 1100 years. Surprisingly, the admixture time for Indian Jewish 2 was relatively recent (480 years).
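LD-based dating yields admixture times in generations, which are converted to years by multiplying by the assumed generation time. The short sketch below makes that arithmetic explicit using the 30-year generation time and the admixture times in years quoted in the text. The generation counts it produces are back-calculated from those rounded figures for illustration and should not be read as the values reported in Table 4; the function and variable names are hypothetical.

```python
GENERATION_TIME_YEARS = 30  # generation time stated in the text

def generations_to_years(generations, generation_time=GENERATION_TIME_YEARS):
    """Convert an admixture time in generations to years."""
    return generations * generation_time

def years_to_generations(years, generation_time=GENERATION_TIME_YEARS):
    """Convert an admixture time in years back to generations."""
    return years / generation_time

# Admixture times in years as quoted in the text (Indian Jewish 1 is approximate).
reported_years = {
    "Indian Jewish 1": 1100,
    "Indian Jewish 2": 480,
    "Indian Jewish 3": 1100,
    "Indian Jewish 4": 1590,
}

# Back-calculated generation counts (illustration only, not Table 4 values).
implied_generations = {
    group: years_to_generations(years) for group, years in reported_years.items()
}
```

Under this assumption, 480 years corresponds to 16 generations and 1590 years to 53 generations. A different generation time would rescale all dates proportionally without changing their relative order.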

Table 4 The time of admixture of Jewish with local Indian populations, inferred from the Alder analysis.

mtDNA and Y chromosomal analysis

To gain more insight into the sex-specific Middle Eastern ancestry among the Indian Jewish, we examined maternally inherited mitochondrial DNA (mtDNA) and paternally inherited Y chromosome biallelic markers in large sample sizes (Tables 5, 6 and Supplementary Tables 5, 6). Consistent with the autosomal analysis, the mtDNA and Y chromosomal haplogroups were mostly South Asia specific (Tables 5 and 6). Apart from South Asian specific lineages (M2-6, M18, M30, M33, M35-37, M39-40, M64, N5, R5-6, R8, R30 and U2), the Indian Jewish also carry 4.6% East Eurasian and 21.1% West Eurasian maternal lineages (Table 5). Among the West Eurasian lineages, subclades of haplogroups H, HV1, J, K, N1a and U5, which are otherwise predominant among Middle Eastern Jewish populations, were absent in the local neighbouring populations (Table 5 and Supplementary Table 5). Interestingly, subclade K1a1b1a, one of the major founder lineages of the Jewish diaspora8,17, was also detected in Indian Jewish 3 but was not observed among local Indian populations. The principal component (PC) analysis of mtDNA placed Indian Jewish 1 within the Indian cluster, whilst Jewish 3 and Jewish 4 were drawn away from the Indian core cluster because of their higher proportion of lineages of Middle Eastern origin (Fig. 5a).
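A PC analysis of haplogroup frequencies of the kind described above can be sketched as follows: each population is a row of haplogroup frequencies, the columns are mean-centred, and the principal components are obtained by singular value decomposition. This is a minimal sketch under stated assumptions. The population labels, the three haplogroup classes and all frequencies are hypothetical, and the text does not specify the software or scaling used for Fig. 5.

```python
import numpy as np

def pca_scores(freq_matrix, n_components=2):
    """Project populations (rows) onto the leading principal components.

    freq_matrix: populations x haplogroups matrix of frequencies.
    Returns (scores, explained), where explained holds the fraction of
    variance captured by each returned component.
    """
    x = np.asarray(freq_matrix, dtype=float)
    centered = x - x.mean(axis=0)  # centre each haplogroup column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Hypothetical haplogroup frequencies (rows sum to 1; illustration only).
populations = ["pop_A", "pop_B", "pop_C", "pop_D"]
freqs = [
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],
    [0.30, 0.20, 0.50],
    [0.20, 0.25, 0.55],
]
scores, explained = pca_scores(freqs)
```

In this toy input, PC1 separates pop_A and pop_B from pop_C and pop_D, mirroring how populations with different lineage proportions fall apart on such a plot. The sign of each component is arbitrary, so only relative positions are meaningful.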

Table 5 The maternal haplogroup sharing of West Indian (Maharashtra), South Indian (Kerala), Indian Jewish and other World Jewish populations.

Table 6 The paternal haplogroup sharing of West Indian (Maharashtra), South Indian (Kerala), Indian Jewish and other World Jewish populations.

Figure 5 The placement of various Indian Jewish groups in the PC1 vs. PC2 analysis obtained from haplogroup frequencies for mtDNA (a) and Y chromosome (b), within other Eurasian populations.

Similar to the maternal haplogroup distribution, the paternal ancestry of the Indian Jewish also included some exclusively Middle East specific haplogroups (E, G, J(xJ2) and I) (Table 6). However, at the present level of resolution, it is not possible to assign a source to other common lineages (e.g. haplogroups J2 and R1a), which might have Middle Eastern roots (Table 6). The PC analysis was not as well differentiated as in the case of mtDNA, because of the overwhelming presence of autochthonous South Asian lineages (Fig. 5b). Indian Jewish 3 and Jewish 4 clustered loosely with the South Asian group, whereas Jewish 1 was located between the Middle Eastern Jewish and South Asian populations (Fig. 5b).