With regard to the manuscripts that the authors withdrew on their own, there are several possible reasons for withdrawing after being asked to provide raw data. One possibility is that, although the authors actually had the raw data, they were not willing to gather all of them and upload them. It is also possible that some authors did not want to disclose raw data that they intended to keep as an exclusive source for data mining and additional publications later. Another possibility is that they chose journals where the disclosure of raw data is not required at the time of publication. However, the "data mining" hypothesis is unlikely for many of the authors in the cases considered here, since most of these manuscripts did not contain big data suitable for data mining; most of the data requested prior to peer review were images of western blots or tissue staining. Note that I asked not only for raw data but also for absolute p-values and for corrections for multiple statistical tests; therefore, the possibility cannot be excluded that some authors did not wish to provide absolute p-values or to conduct corrections for multiple tests, though I do not think these were the primary reasons for withdrawal. As for the manuscripts that I rejected, it is technically possible that the insufficiency of the raw data, or the mismatch between the raw data and the results, reflects honest and careless mistakes.

In academia, these benign interpretations are usually the official ones that we make. According to the flowchart resource of COPE (Committee on Publication Ethics) [10] on suspected fabricated data in a submitted manuscript, suspected fabrication should be investigated by contacting the author and, if necessary, the relevant institution or regulatory body should be alerted so that it can initiate a full investigation. However, when reviewers or editors see such activities, we do not usually express direct concerns about possible misconduct or initiate official investigations, unless truly definitive evidence of misconduct exists. We have a strong tendency, or custom, to assume an honest mistake rather than to suspect fabrication and start an official investigation following such a protocol. This is probably because the current system of scientific publication is based on the belief, or assumption, that researchers are, by nature, fundamentally good.

Considering the experiences that I, as Editor-in-Chief, described above, a skeptic might raise the possibility that some of the raw data behind a summary graph never existed, and that the representative western blot or immunostaining image shown in a figure comes from a limited sampling that does not accurately reflect the sample size denoted in the figure legend. At least in some of the cases described in Fig. 1, I cannot help thinking that the data did not exist from the beginning (yes, I am a skeptic). Could a lackadaisical attitude towards data (ranging from data fabrication at worst to data negligence at best) have occurred in at least some cases?

We really cannot know what percentage of those manuscripts contain fabricated data. Without a formal investigation of all suspected cases, I can only speculate. At the same time, I was interested in how researchers on the internet would speculate with me. I therefore conducted a casual survey on Twitter, in Japanese, asking what possible reasons researchers might have to withhold data when asked by editors before publication and when asked by readers after publication. A translation of the Twitter survey (Additional file 1: Figure S1A) is provided in the Supplementary Text (Additional file 1). Approximately 53% of the 227 respondents from the life sciences answered that they suspect that more than two-thirds of the manuscripts that were withdrawn or did not provide sufficient raw data might have contained fabricated data. While this response from Japanese-speaking scientists is speculation without concrete examples, and is more about 'gut feelings', let us hypothesize that their estimation reflects the facts. If this were the case, the authors of 26 or more of the 40 manuscripts would have committed data fabrication.
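To make the arithmetic behind this hypothetical figure explicit (a back-of-the-envelope restatement of the numbers above, not new data):

\[ 40 \ \text{manuscripts} \times \tfrac{2}{3} \approx 26.7 , \]

that is, roughly 26 or more manuscripts under the respondents' assumption.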

Then, how about the 140 other manuscripts that were not considered 'too beautiful to be true'? Note that I requested raw data only when I felt that the data were 'too beautiful to be true'. Researchers more experienced and careful than the authors who produce 'too beautiful' figures would probably make figures and results, based on non-existent data, that look more realistic, with error bars and effect sizes as modest as those in real data. In such cases, where the figures looked real, I did not ask the authors to provide raw data; this was not ideal but was practically unavoidable under the journal's current data availability policy, which encourages but does not require data deposition. It is likely that at least some of the manuscripts that were sent out for review, albeit not two-thirds of them as suggested by my online survey, included data that were not real. I conducted another Twitter survey, and more than 60% of the 56 life-science researchers who responded thought that, among the 180 manuscripts I handled, approximately as many manuscripts as, or more than, those whose data were 'too beautiful' because of fabrication may have contained made-up data crafted so that people would not spot it at face value and the results would look realistic to experts (Additional file 1: Figure S1B). In other words, more than half of the researchers guessed that, among those who commit misconduct, the careful ones are at least as numerous as the careless ones. Again, supposing that this speculation by 60% of the respondents were correct, among the 180 manuscripts, 52 (= 26 + 26) or more may involve data fabrication. A casual guess by researchers on Twitter thus leads to a rough estimate that more than a quarter of the manuscripts submitted to our journal may include some misconduct.
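Putting the two survey-based guesses together (again, hedged speculation rather than measurement):

\[ \underbrace{26}_{\text{'too beautiful' group}} + \underbrace{26}_{\text{equal number among the other 140}} = 52, \qquad \frac{52}{180} \approx 0.29 , \]

which is where the 'more than a quarter' figure comes from.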

A systematic review and meta-analysis of survey data estimated that 1.97% of authors admitted to having fabricated, falsified, or modified data or results at least once and that, in surveys asking about the behavior of colleagues, the admission rate was 14.12% for falsification [13]. In that systematic review, misconduct was reported more frequently by medical/pharmacological researchers than by others [13], which is consistent with the observation in this editorial that the authors of 34 of the 40 manuscripts that did not provide raw data to Molecular Brain belonged to a hospital or medical school. In another study, in which the images from 20,621 papers published in 40 scientific journals from 1995 to 2014 were visually screened, 3.8% of the published papers contained problematic figures, at least half of which exhibited features suggesting deliberate manipulation [14]. The estimate that a quarter of the manuscripts I handled may include data fabrication is greater than these previous estimates, although my estimate is only rough and casual speculation based on non-scientific, anecdotal episodes. It is unlikely that our journal, Molecular Brain, has a higher incidence of such misconduct than other journals, since I, as Editor-in-Chief, have conducted relatively strict screening before review. The 14 journals that published the rejected or withdrawn manuscripts have impact factors issued by Clarivate Analytics ranging from 2.219 to 4.658 (mean: 3.37) (Additional file 2: Table S1), and such standard journals that are well accepted by the scientific community may have this serious problem, too. It should also be noted that a positive correlation between the 'retraction index' and the journal impact factor has been reported [15], suggesting that high-impact journals cannot be immune to this issue, either.

If a significant portion of submitted manuscripts already involve careless handling or fabrication of data, the reproducibility crisis would be due in part to the absence of raw data. It is not surprising that results cannot be reproduced if the raw data of the studies never existed in the first place. In a survey that asked researchers what leads to problems in reproducibility, more than 40% of the respondents chose the options 'raw data not available from original lab' or 'Fraud' as factors that 'always/often contribute' to irreproducible research [9]. This might be one of the most serious concerns for our research community in this era.