Nobel laureate and University of Chicago economist Ronald Coase said, ‘If you torture the data long enough, it will confess to anything.’ In a recent working paper, former Chief Economic Adviser Arvind Subramanian has cast aspersions on the GDP data of GoI. Prima facie, the analysis seems to be a case of torturing data enough to extract a confession of overestimation, which, euphemistically stated, is data mining to suit conclusions. So, do the conclusions truly hold?

The confession that the paper extracts from the data is that there is a large overestimation of GDP growth, to the extent of 2.5%, after the 2011-12 methodology revisions. The first set of evidence the paper presents, which perhaps in the author’s view is the most compelling, is that correlations between a select few indicators and GDP growth have flipped post-2011.

Well, correlations also flipped in the 1980s and 1990s, and there were no GDP methodology revisions then. Moreover, several indicators were negatively correlated with GDP growth in the 1980s and 1990s as well. So, the claim that negative correlations between economic indicators and GDP growth are symptomatic of measurement error is grossly misplaced.

Subramanian chooses to split the empirical analysis into pre-2011 and post-2011 periods. On closer assessment, the choice looks like data mining to get preferred inferences. When we split the data in the paper one year earlier or later (pre-2010 and post-2010, or pre-2012 and post-2012), we get identical results, with both flipping and negative correlations, as in the paper.

It seems that the author cherry-picks 2011 for empirical convenience, to make us believe that 2011 was indeed a point of inflection coinciding with the GDP methodology change. In layman’s terms, it is as if a doctor is trying to convince you of an ailment that you don’t have, even when empirical medical tests suggest nothing abnormal.
So, the doctor manufactures evidence to substantiate his preferred postulations.

Another rudimentary, but important, error in the paper’s empirics is that the India sample chosen is too small for any robust analysis. Consider the quantification of a large 2.5% overestimation of GDP growth and a too-good-to-be-true confidence interval of 2%. The 2.5% estimate sounds remarkable. But the number of data points used to estimate it is too small to have any statistical significance.

Econometrics tells you that for estimates to have any reliability, you need to have at least 30 data points. Econometrically, the difference-in-difference estimate of a 2.5% GDP growth overestimation is the difference between a mean measured using 10 data points (2002-11) and a mean measured using a mere five data points (2012-16). Both use fewer than half of the minimum data points required for any statistical significance.

The small standard error, a consequence of using cross-sectional data of 71 countries, provides a seemingly credible 2% confidence interval for the estimate. If the empirical setting were such that roughly half of the sample was used as the control group (baseline measure) and the other half as the treatment group, the reliability of both the 2.5% estimate and its confidence interval would have been high.

But the paper uses 70 countries as control and only one country (India) as treatment, which is bad econometrics. Consequently, both the 2.5% GDP overestimation and its confidence interval are highly suspect.

Subramanian makes self-contradictory claims and draws mathematically incorrect conclusions. For instance, the paper claims that import growth less export growth was 1.1% pre-2011 and –0.9% post-2011, and that “such staggering declines are simply incompatible with stable underlying GDP growth”.
The evidence is correct, but the conclusion is not.

If imports outpaced exports by 1.1% pre-2011, the effect on GDP growth is mathematically negative, which is exactly the opposite of what the paper claims. And if exports outpaced imports by 0.9% post-2011, the effect on GDP growth is positive, which contradicts the paper’s conclusions. These could be oversights, but they are far too many to ignore.

India’s GDP estimation process and methodology changes are not whimsical or capricious, and they have adequate checks and balances. The new GDP methodology is globally more comparable, as it takes into account far greater representation of the Indian economy and is, therefore, more reflective of the real state of the economy.

So, academic papers such as these, which doubt the improved methodology and then torture data to draw misleading empirical conclusions in order to sensationalise, don’t help our economy’s cause in any way. Nor do they advance scholarship.
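
For readers who want to check the small-sample point made earlier, here is a rough sketch in Python. It treats the estimate the way the text describes it, as a simple difference between two means, one from 10 yearly observations and one from five. It is not the paper's actual regression. The function name `se_of_mean_difference` and the volatility value `SIGMA` are hypothetical, chosen only for illustration, and the calculation assumes independent observations with a common standard deviation.

```python
import math


def se_of_mean_difference(sigma: float, n_before: int, n_after: int) -> float:
    """Standard error of (mean_after - mean_before) for two independent
    samples that share a common standard deviation sigma.
    The common-sigma, independent-sample setup is a simplifying assumption."""
    return sigma * math.sqrt(1.0 / n_before + 1.0 / n_after)


# Hypothetical year-to-year volatility of 1 percentage point.
# This number is illustrative and is NOT taken from the paper or from GDP data.
SIGMA = 1.0

# Window sizes cited in the text: 10 years (2002-11) and 5 years (2012-16).
se_small = se_of_mean_difference(SIGMA, n_before=10, n_after=5)

# The 30-observation rule of thumb mentioned in the text, applied to both windows.
se_rule = se_of_mean_difference(SIGMA, n_before=30, n_after=30)

print(f"SE with 10 and 5 observations:  {se_small:.3f} x sigma")
print(f"SE with 30 and 30 observations: {se_rule:.3f} x sigma")
print(f"Ratio: {se_small / se_rule:.2f}")
```

Under these assumptions, the standard error with 10 and five observations is a little more than twice what it would be with 30 observations in each window, whatever value is chosen for `SIGMA`. That is the sense in which the India-only comparison is imprecise.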