Proof that COVID-19 data accuracy depends on countries' development levels

April 27, 2020

If COVID-19 data accuracy did not depend on countries' levels of development, there would be no correlation between countries' confirmed coronavirus cases per 100,000 people and their GDP per capita, Human Freedom Index, or Corruption Perceptions Index.

But the correlation is there:

Correlation coefficient values range from -1 to 1 and indicate the relationship between two variables, where -1 is a perfect negative correlation, 0 is the absence of correlation, and 1 is a perfect positive correlation.
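To make the definition concrete, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The post does not say which coefficient was used, so Pearson is an assumption, and the sample values are made up for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only, not real country data.
xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))   # y rises in step with x
print(pearson_r(xs, [8, 6, 4, 2]))   # y falls in step with x
```

The first call describes a perfect positive relationship and the second a perfect negative one, the two ends of the range.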

There is a strong, statistically significant correlation between countries' confirmed COVID-19 cases per 100,000 people and GDP per capita. The correlation coefficient is 0.8 (95% CI: 0.74 to 0.85), a relationship that would not exist if data accuracy were the same in every country.
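The post does not state how the confidence intervals were computed. One common approach is the Fisher z-transformation. The sketch below assumes that method and a sample size of n = 173 (the number of countries with more than 300,000 people listed in this post). Both are assumptions, not the author's documented procedure.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation coefficient r
    computed from n paired observations, via the Fisher z-transformation."""
    z = math.atanh(r)                 # transform r to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    # Transform the interval bounds back to the correlation scale.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

# n = 173 is an assumption taken from the country count in this post.
low, high = fisher_ci(0.8, 173)
print(f"{low:.2f} to {high:.2f}")
```

Under those assumptions, r = 0.8 gives an interval that rounds to 0.74 to 0.85, the same bounds quoted above. An interval that excludes 0 is what "statistically significant" means here.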

The chart includes only countries with a population of more than 300,000 people. Otherwise, very small countries like the Vatican would distort the picture, because a handful of cases translates into an extreme value per 100,000 people.
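The effect of the population cutoff can be sketched as follows. The country names and numbers are hypothetical, and the helper names `cases_per_100k` and `filter_by_population` are introduced here for illustration only.

```python
MIN_POPULATION = 300_000  # cutoff described in the text

def cases_per_100k(cases, population):
    """Confirmed cases scaled to a population of 100,000."""
    return cases * 100_000 / population

def filter_by_population(rows, min_population=MIN_POPULATION):
    """Keep only rows whose population is strictly above the cutoff."""
    return [row for row in rows if row["population"] > min_population]

# Hypothetical rows, not real data.
rows = [
    {"name": "Tinyland", "cases": 9, "population": 800},
    {"name": "Bigland", "cases": 50_000, "population": 50_000_000},
]

for row in rows:
    print(row["name"], cases_per_100k(row["cases"], row["population"]))

print([row["name"] for row in filter_by_population(rows)])
```

In this made-up example, nine cases in a population of 800 yield a far higher rate per 100,000 than 50,000 cases in a population of 50 million, which is why such outliers are dropped before computing the correlation.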

But even for the sample of all 197 countries, the correlation coefficient is 0.49 (95% CI: 0.38 to 0.59), which is weaker but still statistically significant.

Confirmed COVID-19 cases per 100,000 people also correlate with the Human Freedom Index (by the Cato Institute) and the Corruption Perceptions Index (by Transparency International).

For comparison, here's what the graph of coronavirus cases per 100,000 people against total population looks like:

The correlation coefficient is -0.04 (95% CI: -0.19 to 0.11), and the regression line is almost parallel to the x-axis. Because the confidence interval includes 0, there is no detectable correlation, as expected.
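The link between a flat regression line and the absence of correlation can be illustrated with an ordinary least-squares fit. This is a minimal sketch with made-up numbers. The post does not say how its regression line was fitted, so least squares is an assumption.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative values only: y does not trend up or down as x grows.
slope, intercept = least_squares_line([1, 2, 3, 4], [3, 5, 5, 3])
print(slope, intercept)
```

The slope and the correlation coefficient share the same numerator (the covariance of the two variables), so when one is close to zero the other is too, and the fitted line runs nearly parallel to the x-axis.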

The 173 countries with a population of more than 300,000, ranked by confirmed COVID-19 cases per 100,000 people: