If 3.8% of the 8,778,928 biomedical publications indexed in PubMed from 2009-2016 contain a problematic image, and 10.6% of that group contain images of sufficient concern to warrant retraction, then we can estimate that approximately 35,000 papers are candidates for retraction due to image duplication.
This startling extrapolation comes from an analysis of inappropriate images in papers published in one journal in a seven-year span, performed by a group of American biologists. They uploaded their findings as a preprint to the bioRxiv server on June 24.
The journal in question is Molecular and Cellular Biology. The biologists analysed 960 papers published in the journal between 2009 and 2016, and found that 6.1% (59) of them contained “inappropriately duplicated” images. Of these 59, five were subsequently retracted from the journal, no action was taken in 12 cases and 42 of them were corrected without retraction.
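The arithmetic behind these figures is easy to check. The short Python sketch below is illustrative and is not taken from the preprint: the variable names are assumptions made for this example, and the inputs are only the numbers quoted in this article.

```python
# Inputs as quoted in the article; the names are illustrative, not from the preprint.
pubmed_papers = 8_778_928   # biomedical papers indexed in PubMed, 2009-2016
problematic_rate = 0.038    # share assumed to contain a problematic image
retraction_rate = 0.106     # share of those assumed to warrant retraction

# Two-step extrapolation: all papers -> problematic papers -> retraction candidates.
problematic = pubmed_papers * problematic_rate
retraction_candidates = problematic * retraction_rate
print(f"Papers with a problematic image: about {problematic:,.0f}")
print(f"Candidates for retraction: about {retraction_candidates:,.0f}")

# The Molecular and Cellular Biology sample described above.
mcb_papers = 960
mcb_flagged = 59
outcomes = {"retracted": 5, "no action": 12, "corrected": 42}

# The three outcomes should account for every flagged paper.
assert sum(outcomes.values()) == mcb_flagged
print(f"Share of sampled papers flagged: {mcb_flagged / mcb_papers:.1%}")
```

The product comes to a little over 35,000, which matches the rounded estimate. The result depends on both assumed rates: if either were halved, the estimate would halve too.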
The authors argue that journals should institute measures to screen submitted manuscripts for problematic images, and correct or remove them, before publication, instead of allowing papers to be published as-is and tackling issues later, once they are identified. Apart from noting that such images distort the scientific record, the biologists also presented a revealing statistic: pursuing remedial action post-publication cost 6 hours of staff time per paper on average, whereas dealing with the same problem pre-publication cost 30 minutes.
Although projecting that 35,000 papers might have to be retracted sounds alarming, it’s certainly not surprising as far as the prevalence of mistakes is concerned. “My impression is that a majority of problems are the result of honest mistakes,” Arturo Casadevall, a professor of molecular biology at the Johns Hopkins School of Public Health, Baltimore, and one of the coauthors of the preprint, told The Hindu. “The fact that these papers could be corrected means that the figure problems did not affect the conclusions of the study.”
In October 2017, two scientists from the Netherlands reported that over 32,000 papers in the scientific literature had described some finding or other drawn from the analysis of contaminated cells. While the conclusions of these papers are not entirely void, they are not relevant to the pristine cells the papers’ authors thought they were working with. Compounding the problem, over 500,000 other papers had cited these flawed ones to build on their conclusions, precipitating an avalanche of cascading mistakes.
While people commonly defer to science to provide an objective interpretation of the natural universe, and tout its purportedly self-correcting tendencies, the scientific record insofar as it is represented in the technical literature is continuously flawed and never perfect. Corrections to this record typically emerge in the order of decades, and it pays to be aware of these trends when we compose a concept of how science actually works and what leeway it deserves.
As Casadevall said, most mistakes pertaining to the images in their analysis were ‘honest’ – as they presumably were in the case of papers reporting findings based on contaminated cell lines. However, this isn’t to say there can’t be a more organised sickness in the system of scientific research in the form of researchers deliberately manipulating images to score papers.
A common – and not inappropriate – example festers in India. Here, due to a combination of factors too numerous, yet well known, to detail, researchers often publish sub-par papers in predatory journals that typically don’t subject these texts to peer review and, in fact, will publish anything at all for a substantial fee. An analysis of papers published by Indian authors in 2016 revealed that “India had a 1.93-times higher-than-predicted ratio of papers containing image duplication”.
A part of the problem here is the absence of a proper redressal or sanction mechanism. While universities include anti-misconduct pledges in policy documents, complaints of misconduct against a faculty member can only be directed towards the offender’s colleagues, and there are no extra-university regulations for what should happen next or in what timeframe. As a result, swift and fitting action is almost unheard of, even against high-profile offenders.
Elisabeth Bik, a microbiologist and scientific editorial director of uBiome, a company that provides microbiome-based precision medicine services, led the 2016 analysis and was also involved in the analysis of papers in Molecular and Cellular Biology. She wrote in the former paper, “Papers containing inappropriately duplicated images originated more frequently from China and India and less frequently from the United States, United Kingdom, Germany, Japan, or Australia” and that “ongoing efforts at scientific ethics reform in China and India should pay particular attention to issues relating to figure preparation”.
The bulk of problematic papers emerging from India are marred by plagiarism and self-plagiarism. One study showed in 2010 that of every 100,000 papers published from India in the decade from 2001, 18 were fraudulent (the global average was 4). The study counted only data plagiarism as fraud, and classified self-plagiarism and plagiarism of the text as ‘mistakes’. When the latter two factors are also classified as fraud, the fraud rate jumps to 44 per 100,000. According to another study, the fraction of retracted articles published from India jumped from 0.017% in 2001-2005 to 0.037% in 2006-2010.
T.A. Abinandanan, a materials scientist at the Indian Institute of Science, Bengaluru, who performed the India analysis, has written on his blog, “Retraction of papers from Indian authors shows a steep fall since 2007 – either because Indian researchers know better now or because plagiarised papers are ever less likely to make it to print in the first place due to increasingly widespread use of plagiarism detecting software by journals.”