While the analyzed transplant data presented in this manuscript raises severe doubts about the veracity of the claims of ethical reform in China, it is important to highlight qualitative indications that some hospitals in China are clearly engaged in the lifesaving work of voluntary organ procurement, allocation, and transplantation.

On December 31, 2015, Shanghai reported its 138th voluntary deceased donor. A television documentary featured the case of the 125th donor in the city, whose organs were procured at Huashan Hospital in November 2015; 40 doctors were involved in the procurement of organs from the donor; family members, doctors, and nurses were all interviewed [43].

Wuhan Tongji Hospital in Hubei Province reported the establishment of a full-time post for a transplant coordinator from January 2014, outlining the curriculum, the hours studied, the day-to-day work, and the increase in donations that resulted: from 98 donors in 2013 to 176 in 2015 [44]. Wuhan Tongji is one of the most advanced transplant hospitals in China, having acquired organs from brain-dead donors since 2001 [45].

Jilin Province’s Red Cross reports of voluntary deceased transplants (183 donors by the end of March 2016) broadly correspond with hospital reports (about 220 donors by October 2016) [46].

The preceding examples are consistent with statements by leading Chinese medical administrators that most of the country’s voluntary donation activity takes place at a small number of major transplant centers [11, 47].

Modest success in organ transplant reform at a few key hospitals, however, is not all that Chinese medical officials have claimed. They have claimed a revolution in transplant practice across the country, with numbers of allegedly voluntary deceased donors growing at geometric rates and the cessation of all sourcing of organs from prisoners. It is these claims that are assessed in this report.

Inferences from the data analysis

COTRS data

Analysis of the COTRS 2017 data shows an average divergence of less than 0.08% from perfect quadratic formulae. We believe the assumption that these numbers emerged by chance, or as a coincidence, from organ donations taking place across China, would require an unrealistic and implausible sequence of events persisting for 7 years.

The management of a voluntary organ donation, allocation, and transplantation system involves a highly complex set of processes and interactions between thousands of individuals at hundreds of locations over many years. Almost all instances of donation first require the occurrence of a brain injury, the diagnosis of brain death, the absence in the donor of numerous disqualifying diseases or contraindicating medications, preservation of the donor, gaining consent from the family to donate, blood and tissue typing, crossmatching, location of the recipient (through COTRS), their rapid conveyance to the hospital, multiple organ procurements, and then transplantations by separate medical teams. In many cases each donor triggers activities requiring the coordination of multiple hospitals. In China at the time of this study, 173 hospitals were licensed to perform voluntary donations, with thousands of doctors, nurses, and support staff, in a country of 1.4 billion people. Moreover, during the years in question the donation system was still being constructed. Doctors were being trained in the new diagnostic criteria for brain or circulatory death, nurses were being trained as transplantation coordinators, the public was being educated in the need to donate, families were making culturally difficult decisions about the disposition of the bodies of their deceased relatives, and the infrastructure for connecting potential donors with recipients, transplant teams, and other hospitals was still being built.

Given all of these variables across multiple interlocking processes, the finding that China’s data for voluntary deceased donors, kidney transplants, and liver transplants conforms to three almost perfect quadratic equations is highly surprising. Genuine data generated by a complex, growing system subject to all of the above contingencies would not be expected to exhibit such a sustained, extremely smooth growth curve.

Additional file 1 compares the R-squared statistics and the mean squared errors from a quadratic against comparable multi-year annual donor and transplant data from 50 other countries in the Global Observatory of Donation and Transplantation database, managed by the World Health Organization and Spain’s National Organization of Transplants. The scatter charts show that China’s R-squared statistics for fit to quadratic formulae are between one and two orders of magnitude outside the range of all the other data, and that the mean squared error for China does not conform to the pattern of any other country. Thus, the fundamental behavior of the Chinese data — not just the rapid growth rate in transplants it depicts — is qualitatively different to every other country for which there is comparable data.
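The two fit statistics used in this comparison can be sketched in a few lines. The following is a minimal illustration only, not the code used for Additional file 1: the function names are our own, and both series are synthetic placeholders (an exact quadratic and the same curve with arbitrary offsets), not figures from COTRS or the Global Observatory.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    powers = [sum(x ** k for x in xs) for k in range(5)]  # sums of x^0 .. x^4
    lhs = [[powers[4], powers[3], powers[2]],
           [powers[3], powers[2], powers[1]],
           [powers[2], powers[1], powers[0]]]
    rhs = [sum(y * x * x for x, y in zip(xs, ys)),
           sum(y * x for x, y in zip(xs, ys)),
           sum(ys)]
    return solve_linear(lhs, rhs)  # [a, b, c]


def fit_quality(xs, ys):
    """Return R-squared and mean squared error of the fitted quadratic."""
    a, b, c = fit_quadratic(xs, ys)
    residuals = [y - (a * x * x + b * x + c) for x, y in zip(xs, ys)]
    sse = sum(r * r for r in residuals)
    mean_y = sum(ys) / len(ys)
    sst = sum((y - mean_y) ** 2 for y in ys)
    return {"r_squared": 1 - sse / sst, "mse": sse / len(ys)}


# Synthetic placeholder series (NOT real transplant data).
xs = [1, 2, 3, 4, 5, 6, 7]
smooth = [50 * x * x for x in xs]              # exact quadratic
offsets = [30, -40, 25, -35, 45, -20, 10]      # arbitrary year-to-year irregularity
noisy = [y + d for y, d in zip(smooth, offsets)]

smooth_quality = fit_quality(xs, smooth)
noisy_quality = fit_quality(xs, noisy)
```

Under these assumptions the exact quadratic yields an R-squared indistinguishable from 1 and a mean squared error near zero, while even modest irregularity lowers the former and raises the latter. That is the contrast the scatter charts are intended to expose.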

We propose that the most plausible explanation for the COTRS 2017 data is manual and deliberate manipulation in order to fit a target donor rate, with a mathematical function selected as the most efficient way to both 1) reach this goal in an apparently natural manner, and 2) provide a common reference and guide for derivative data through the Chinese system. It is the exquisite precision of the fit of the data to the Procrustean Bed of a smooth mathematical formula that we believe rules out competing explanations.

A unique window of opportunity for the testing of our initial hypothesis was opened when Chinese authorities published updated COTRS data in July 2018.

The statistical analysis of COTRS 2018 data demonstrates that the extra datapoint strengthens the initial hypothesis of data falsification.

While we are aware that a range of values for 2017 would be consistent with the model, it is highly significant that the value that did appear, 5146, allowed for a major simplification and therefore strengthened the hypothesis that a mathematical model was being used.

The significance of the new datapoint for deceased donors in 2017 is that it lies in a very restricted range that implies a more significant conformance not to a general quadratic equation, but to a one-parameter quadratic of the form $y = ax^2$ (where $x = 0$ corresponds to 2010, the year in which Chinese authorities state they began the system of voluntary deceased organ allocation).

The analysis shows that the far more parsimonious model of $y = ax^2$ is just as powerful as $y = ax^2 + bx + c$ for explaining the growth curve. This model simplification significantly increases the leverage of the argument that the data was not generated from real world transplant activity.

In the Results, we further asked: how surprising is it that $b$ and $c$ should become so insignificant? Would this have happened for other 2017 values, or is 5146 in some sense special? This was tested by looking at scenarios where the 2017 value ranged from 4950 to 5350. For each value, the two models ($y = ax^2 + bx + c$ and $y = ax^2$) were both fitted, and the reduction in SSE (sum of squares of errors from the model) obtained by using the former instead of the latter was calculated.

The resulting graph (Fig. 3) gave a visual indication that the 2017 value (5146) was almost exactly the value that most strongly reinforces the model simplification to $y = ax^2$.
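The scenario test described above can be sketched as follows. This is a hedged illustration rather than the code behind Fig. 3: the function names are hypothetical, and the earlier-year values are a synthetic placeholder series (an exact $100x^2$ curve), not the COTRS figures. The candidate range is likewise chosen to suit the placeholder, not the 4950 to 5350 range used in the Results.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def sse_full_quadratic(xs, ys):
    """SSE of the least-squares fit y = a*x^2 + b*x + c."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    lhs = [[powers[4], powers[3], powers[2]],
           [powers[3], powers[2], powers[1]],
           [powers[2], powers[1], powers[0]]]
    rhs = [sum(y * x * x for x, y in zip(xs, ys)),
           sum(y * x for x, y in zip(xs, ys)),
           sum(ys)]
    a, b, c = solve_linear(lhs, rhs)
    return sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))


def sse_one_parameter(xs, ys):
    """SSE of the least-squares fit y = a*x^2 (closed-form estimate of a)."""
    a = sum(x * x * y for x, y in zip(xs, ys)) / sum(x ** 4 for x in xs)
    return sum((y - a * x * x) ** 2 for x, y in zip(xs, ys))


def sse_reduction_scan(earlier_values, candidates):
    """For each candidate final-year value, return the SSE saved by the full model."""
    xs = list(range(len(earlier_values) + 1))  # x = 0 is the first year of the series
    results = []
    for candidate in candidates:
        ys = list(earlier_values) + [candidate]
        reduction = sse_one_parameter(xs, ys) - sse_full_quadratic(xs, ys)
        results.append((candidate, reduction))
    return results


# Synthetic placeholder for the earlier years (NOT the COTRS series).
placeholder_series = [100 * x * x for x in range(7)]
scan = sse_reduction_scan(placeholder_series, range(4700, 5101, 50))
best_candidate, smallest_reduction = min(scan, key=lambda item: item[1])
```

A small reduction means the extra parameters $b$ and $c$ add almost nothing; the candidate at which the reduction is smallest is the final-year value that most favours the one-parameter model.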

We extended the analysis yet further, probing whether the power function also appeared to be the result of a manmade mathematical model. While there is no reason why real world data would conform extremely closely to any power function $y = ax^q$ for any value of $q$, if it did there is no reason why the power $q$ should be an integer such as $q = 2$, giving $y = ax^2$. Thus, we fitted $ax^q$ to the data, for $q$ ranging from 1 to 3, and calculated the (adjusted) $r^2$ for each $q$.

The resulting graph (Fig. 4) demonstrated that the optimal $q$ moved from 2.07 to 2.01 between the COTRS 2017 and COTRS 2018 data (that is, from $y = ax^{2.07}$ to $y = ax^{2.01}$), meaning that the optimal power model was shown to be virtually identical to $y = ax^2$, further reinforcing the inexplicable and unrealistic simplicity of what is presented as real world data.
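A minimal sketch of such an exponent scan is given below. It is illustrative only: the function names are hypothetical, the series is a synthetic placeholder, and the adjustment uses the textbook form $1 - (1 - r^2)(n - 1)/(n - p - 1)$ with $p = 1$, which is our assumption, since the convention used for Fig. 4 is not restated here. Because the adjustment is monotone in $r^2$ for a fixed number of points, the choice does not affect which $q$ is optimal.

```python
def adjusted_r2_for_power(xs, ys, q):
    """Fit y = a * x**q by least squares for a fixed q and return the adjusted r^2."""
    basis = [x ** q for x in xs]
    a = sum(b * y for b, y in zip(basis, ys)) / sum(b * b for b in basis)
    sse = sum((y - a * b) ** 2 for b, y in zip(basis, ys))
    mean_y = sum(ys) / len(ys)
    sst = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - sse / sst
    n, predictors = len(ys), 1  # assumption: one predictor, textbook adjustment
    return 1 - (1 - r2) * (n - 1) / (n - predictors - 1)


def scan_exponents(xs, ys, q_min=1.0, q_max=3.0, steps=200):
    """Evaluate the fit on an evenly spaced grid of exponents between q_min and q_max."""
    grid = [q_min + (q_max - q_min) * i / steps for i in range(steps + 1)]
    return [(q, adjusted_r2_for_power(xs, ys, q)) for q in grid]


# Synthetic placeholder series (NOT real transplant data): an exact 3 * x^2 curve.
xs = list(range(8))                 # x = 0 is the first year of the series
ys = [3 * x * x for x in xs]

exponent_scan = scan_exponents(xs, ys)
best_q, best_adjusted_r2 = max(exponent_scan, key=lambda item: item[1])
```

For data generated by an exact square law the scan peaks at $q = 2$. For data arising from independent real-world events there is no particular reason to expect the peak to fall on, or drift towards, an integer.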

The simpler the model required to explain the data, the more difficult it becomes to argue that it was generated through a random and complex series of organ donation and transplantation events — and the stronger the leverage in the argument that the data was in fact generated by a simple model in the first place.

A discussion of the statistical significance of these findings, newly possible due to the simplification of the model (which allowed a linear regression to be fitted), is available in Additional file 4.

The Red Cross 2019 data was found to remain consistent with a quadratic formula. While it was not as supportive of the simplified $y = ax^2$ model as COTRS 2018 was, it was still consistent with it, with a high $r^2$.

Central Red Cross data

Data from the China Organ Donation Administration Center, managed by the Red Cross Society of China, is supposed to provide third party witness to and registration of every voluntary transplant. It contains five identified internal anomalies, at least three of which we believe are extremely difficult to explain without human-directed manipulation of the dataset.

Anomaly A1 is perplexing: between April 23 and 30, only the date, not the data, was updated on china-organdonation.org, while www.codac.org.cn showed different data on April 23. It could be interpreted in multiple ways. On its face, if the codac.org.cn update is disregarded, it indicates that no transplant activity at all took place during the intervening 7-day period, a highly unlikely scenario. Or, again disregarding the conflicting data, it could be that the date change was merely a clerical error. Alternatively, the April 23 www.codac.org.cn update may have been the ‘real’ data intended for that date, and the china-organdonation.org update on April 23 an accidental revelation of the predetermined April 30 figures. We have no way of adjudicating between these possibilities, though we believe the accrual of such anomalies speaks to data integrity problems.

Anomaly A2 may have been a transcription error.

Anomalies B, C, and D, while potentially having innocent explanations, have the effect of vectoring the internal relationship in the data for transplants per donor back to an arbitrary ratio of 2.75, raising questions as to whether the anomalies are signs of manipulated data.

Anomaly C is the most remarkable, in which for a 10-day period in March 2016, the dataset reports that 21.3 organs were obtained per donor, which is clearly impossible. The addition of this data again “corrected” the dataset to be in line with the arbitrary organs/donor ratio. This anomaly cannot be discounted, because each subsequent cumulative number has built into it this clearly impossible figure. To make the dataset coherent, a series of entries around Anomaly C on 3/20/2016 would need to be retroactively modified. Furthermore, the possibility of backlogs of COTRS data being entered into the database as an explanation for C and D is in direct contradiction to the required procedures of COTRS allocations.
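The arithmetic behind this kind of check can be illustrated with a short sketch. Every number below is invented for illustration and none is taken from the Central Red Cross series; the plausibility cap of eight organs per donor is likewise our own assumption, and the function name is hypothetical.

```python
# Assumption: a generous upper bound on solid organs obtainable from one donor.
MAX_PLAUSIBLE_ORGANS_PER_DONOR = 8


def interval_ratios(snapshots):
    """Compare consecutive cumulative snapshots.

    snapshots: list of (label, cumulative_donors, cumulative_organs).
    Returns one row per interval with the organs/donor ratio inside the
    interval and the cumulative ratio at the end of it.
    """
    rows = []
    for previous, current in zip(snapshots, snapshots[1:]):
        new_donors = current[1] - previous[1]
        new_organs = current[2] - previous[2]
        ratio = new_organs / new_donors if new_donors > 0 else None
        implausible = (
            (ratio is None and new_organs > 0)
            or (ratio is not None and ratio > MAX_PLAUSIBLE_ORGANS_PER_DONOR)
        )
        rows.append({
            "interval": (previous[0], current[0]),
            "interval_ratio": ratio,
            "cumulative_ratio": current[2] / current[1],
            "implausible": implausible,
        })
    return rows


# Invented cumulative figures (NOT the Red Cross data), chosen so that one
# interval carries an impossible ratio while pulling the running total to 2.75.
hypothetical_snapshots = [
    ("day 0", 1002, 2570),
    ("day 10", 1012, 2783),
    ("day 20", 1032, 2838),
]
rows = interval_ratios(hypothetical_snapshots)
```

In this invented series the first interval adds 213 organs from 10 donors, a ratio of 21.3, yet the cumulative ratio afterwards sits at exactly 2.75. The sketch shows why an impossible interval figure cannot simply be set aside: every later cumulative total inherits it.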

While these anomalies now appear obvious, it is understandable that they have not been discovered until now, as the Central Red Cross data did not previously exist in a series. It is only after the full series was captured, archived, logged, and analyzed that these anomalies revealed themselves.

The extraordinary growth in registered volunteer donors also raises questions. Given the paucity of data on this at the local level, no other metric exists with which to compare the figures, and thus they cannot be invalidated outright. It is possible that the one-day leap of exactly 25,000 on 12/31/2015 was due to a batch upload of data on a single day, yet this pattern did not occur before or since, and was followed by steady growth through 2016. Then, again at the end of the year, the entire dataset doubled within the space of a week, between 12/25/2016 and 12/31/2016. These two sudden changes to the data raise questions about the integrity of the series.

Collectively, we maintain that these apparent human-directed alterations of the Central Red Cross data are consistent with the contention that, like the COTRS dataset, the data was not formed by the accretion of individual cases of successful voluntary organ donation and allocation, but instead was manually manipulated to fit the formula-derived COTRS master version, albeit with imperfect results.

Comparison between Central Red Cross and COTRS data

The comparison between the two datasets yielded two important findings. The first is that, in confirmation of official statements, they tend to have identical numbers of donors. Although this is not the case for several years, for reasons detailed later, the convergence is very close by the end of 2015 (with a discrepancy of 54 donors) and exact by the end of 2016, when the Central Red Cross database is updated to 9996 donors on the last day of the year, in line with COTRS. Thus, by the end of 2016 the datasets are in agreement to the day.

In contrast, the number of transplanted organs diverges, even after adding the national heart and lung transplants, which are not reported in the COTRS data. There is an inexplicable gap of 555 transplants for the year 2016.

A potential explanation for this gap that preserves the integrity of the data would be that it is filled by other solid organs — pancreatic and small intestine transplants. This explanation does not appear to hold, however. While data on transplants of these organs in China is difficult to come by, according to a 2011 Chinese medical paper, only 200 pancreas transplants had been performed since 1989 [48]. The vast majority, if not all of these, would have used organs from prisoner sources. Small intestine transplants are far rarer still. There is no subsequent data suggesting that pancreas transplants increased at the extremely high rate that would have been required to close the gap between the Red Cross and COTRS datasets, in particular under the more constrained conditions of voluntary donor sourcing. Thus, the gap in reports of the transplanted organs remains.

The finding of both a concordance between donors in the datasets, yet a persistent, inexplicable gap in organs, casts doubt on the integrity of both datasets. Had the number of donors not been identical, it may have been possible to reason that the two databases are in fact different in some fashion — for example that the Red Cross data captures some transplant activity, while the COTRS data captures other activity.

We believe that the inexplicable divergence supports the hypothesis that the datasets are updated by design rather than from real organ allocation data. Any such manual manipulation contains the potential for the kinds of discrepancies we have discovered.

Comparison between provincial Red Cross data and hospital activity

The examination of hospital-level transplant activity (detailed and discussed in Additional file 5) does not disconfirm the data-based findings; rather, it tends to corroborate them. While this finding cannot be conclusive due to the lack of transparency around hospital activity, we believe that the most plausible interpretation of the data is that it is part of a pattern of data fabrication extending to the provincial level.

General remarks about the data analysis

In light of the accumulation of findings above, we believe that two propositions can be advanced about the datasets under examination.

The first is that the unusual and anomalous features in the data are due to deliberate human intervention. We believe this is the only plausible explanation for the qualities identified in the COTRS, and central and provincial Red Cross data, which include mirroring of quadratic formulae, stubborn adherence to arbitrary ratios, anomalies that abrogate the mathematical integrity of data series, unsubstantiated growth patterns, and other irregularities. It is difficult to imagine how such data from three sources could have come to possess these qualities if not for deliberate, ongoing and imperfect human intervention.

The second is that this intervention could not have been piecemeal or without forethought. This proposition is based primarily on the COTRS datasets, which consist of three sets of seven (COTRS 2017), and one set of eight (COTRS 2018), data points that we believe were evidently derived from arbitrary mathematical formulae. Whatever the real data, this result was conceived as a single act: it could not be an accretion of manual updates, since the probability of accretive manipulations arriving at such a perfect mathematical function must be roughly equivalent to that of such an outcome arising from natural activity. The other two data series (central and local Red Cross figures) have apparently been made in the image of this data. If the COTRS data was falsified through a top-down process, and the central and local Red Cross data adhere to the COTRS data, then they must have been derived in a similar fashion, with top-level coordination to maintain a semblance of congruence. We believe that the failure of congruence in organ transplants between Central Red Cross and COTRS figures, the simplification of the model between COTRS 2017 and COTRS 2018 series, and the failure of hospital activity to substantiate provincial Red Cross figures, all corroborate the hypothesis that the annual figures were handed down as quotas, albeit quotas that were imperfectly implemented across a fractured bureaucratic and administrative apparatus, thus exposing the discrepancies identified.

Additional sources and considerations

The statistical forensics and data-based findings must be situated within the broader context of the challenges, obstacles, and distinctive features of China’s voluntary organ transplant reforms. In broad terms, these include the following, details of which may be found in Additional file 6: