Klaus Kayser has been publishing electronic journals for so long he can remember mailing them to subscribers on floppy disks. His 19 years of experience have made him keenly aware of the problem of scientific fraud. In his view, he takes extraordinary measures to protect the journal he currently edits, Diagnostic Pathology. For instance, to prevent authors from trying to pass off microscope images from the Internet as their own, he requires them to send along the original glass slides.

Despite his vigilance, however, signs of possible research misconduct have crept into some articles published in Diagnostic Pathology. Six of the 16 articles in the May 2014 issue, for instance, contain suspicious repetitions of phrases and other irregularities.* When Scientific American informed Kayser, he was apparently unaware of the problem. "Nobody told this to me," he says. "I'm very grateful to you."

Diagnostic Pathology, which is owned by Springer, is considered to be a reputable journal. Under Kayser’s stewardship, its “impact factor” (a crude measure of a journal's reputation, based on the number of times its articles are cited in the published scientific literature) is 2.411, which puts it solidly in the top quarter of all scientific journals tracked by Thomson Reuters in its Journal Citation Reports, and 27th out of the 76 ranked pathology journals.

Kayser’s journal is not alone. In the past few years similar signs of foul play in the peer-reviewed literature have cropped up across the scientific publishing world, including in journals owned by publishing powerhouses Wiley, Public Library of Science, Taylor & Francis and Nature Publishing Group (which publishes Scientific American).

The apparent fraud is taking place as the world of scientific publishing—and research—is undergoing rapid change. Scientists, for whom published articles are the route to promotion or tenure or support via grants, are competing harder than ever before to get their articles into peer-reviewed journals. Scientific journals are proliferating on the Web but, even so, supply is still unable to keep up with the ever-increasing demand for respectable scientific outlets. The worry is that this pressure can lead to cheating.

The dubious papers aren't easy to spot. Taken individually, each research article seems legitimate. But in an investigation by Scientific American that analyzed the language used in more than 100 scientific articles, we found evidence of some worrisome patterns: signs of what appears to be an attempt to game the peer-review system on an industrial scale.

For example, one of the articles published in the May 2014 Diagnostic Pathology looks on the surface like a typical meta-analysis of the peer-reviewed literature. Its authors—eight scientists from Guangxi Medical University in China—assess whether different variations in a gene known as XPC can be linked to gastric cancer. They find no such link, and concede that their paper isn't the final word on the matter:

“However, it is necessary to conduct large sample studies using standardized unbiased genotyping methods, homogeneous gastric cancer patients and well-matched controls. Moreover, gene–gene and gene–environment interactions should also be considered in the analysis. Such studies taking these factors into account may eventually lead to our better, comprehensive understanding of the association between the XPC polymorphisms and gastric cancer risk.”

A perfectly normal conclusion for a perfectly ordinary paper. It is nothing that should set off any alarm bells. Yet, compare it with a paper published several years earlier in the European Journal of Human Genetics (which is owned by Nature Publishing Group), a meta-analysis of whether variations in a gene known as CDH1 could be linked to prostate cancer (PCA):

“However, it is necessary to conduct large trials using standardized unbiased methods, homogeneous PCA patients and well-matched controls, with the assessors blinded to the data. Moreover, gene–gene and gene–environment interactions should also be considered in the analysis. Such studies taking these factors into account may eventually lead to our better, comprehensive understanding of the association between the CDH1−160 C/A polymorphism and PCA risk.”

The wording is almost identical, down to the awkward phrase, "lead to our better, comprehensive understanding." The only substantial differences are the specific gene (CDH1 rather than XPC) and the disease (PCA rather than gastric cancer).
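The overlap between the two passages quoted above can be put into rough numbers. What follows is a minimal sketch in Python, not the method Scientific American used: it assumes a simple word-level comparison with the standard library's difflib module, and the helper names (tokenize, shared_word_ratio, differing_words) are made up for this illustration. The dashes in the quoted text are written as plain hyphens.

```python
import difflib
import re

# The two conclusions quoted above, with typographic dashes replaced by hyphens.
XPC_PASSAGE = (
    "However, it is necessary to conduct large sample studies using "
    "standardized unbiased genotyping methods, homogeneous gastric cancer "
    "patients and well-matched controls. Moreover, gene-gene and "
    "gene-environment interactions should also be considered in the analysis. "
    "Such studies taking these factors into account may eventually lead to our "
    "better, comprehensive understanding of the association between the XPC "
    "polymorphisms and gastric cancer risk."
)
CDH1_PASSAGE = (
    "However, it is necessary to conduct large trials using standardized "
    "unbiased methods, homogeneous PCA patients and well-matched controls, "
    "with the assessors blinded to the data. Moreover, gene-gene and "
    "gene-environment interactions should also be considered in the analysis. "
    "Such studies taking these factors into account may eventually lead to our "
    "better, comprehensive understanding of the association between the "
    "CDH1-160 C/A polymorphism and PCA risk."
)


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def shared_word_ratio(a, b):
    """Return difflib's similarity ratio (0.0 to 1.0) over word sequences."""
    matcher = difflib.SequenceMatcher(None, tokenize(a), tokenize(b), autojunk=False)
    return matcher.ratio()


def differing_words(a, b):
    """List (words_in_a, words_in_b) pairs for every stretch that is not shared."""
    words_a, words_b = tokenize(a), tokenize(b)
    matcher = difflib.SequenceMatcher(None, words_a, words_b, autojunk=False)
    return [
        (" ".join(words_a[i1:i2]), " ".join(words_b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    print(f"similarity: {shared_word_ratio(XPC_PASSAGE, CDH1_PASSAGE):.2f}")
    for old, new in differing_words(XPC_PASSAGE, CDH1_PASSAGE):
        print(f"  {old!r} -> {new!r}")
```

On these two passages the unmatched stretches are mostly the "blanks" in the template: the gene, the disease and a few words about study design. A high ratio by itself proves nothing about who wrote a paper; it only flags text that merits a closer look.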

This is not a simple case of plagiarism. Many seemingly independent research teams have been plagiarizing the same passage. An article in PLoS ONE may eventually lead to "our better, comprehensive understanding" of the association between mutations in the XRCC1 gene and thyroid cancer risk. Another in the International Journal of Cancer (published by Wiley) might eventually lead to "our better, comprehensive understanding" of the association between mutations in the XPA gene and cancer risk—and so on. Sometimes there are minor variations in the wording but in more than a dozen articles we found almost identical language with different genes and diseases seemingly plunked into the paragraph, like an esoteric version of Mad Libs, the parlor game in which participants fill in missing words in a passage.

We have found other examples of fill-in-the-blanks research. A search for the phrase "excluded because of obvious irrelevance" retrieved more than a dozen research articles of various types—all but one written by scientists from China. "Using a standardized form, data from published studies" also yields more than a dozen research articles, all from China. "Begger's funnel plot" gets dozens of hits, all from China.

“Begger's funnel plot” is particularly revealing. There is no such thing as a Begger's funnel plot. "It doesn't exist. That's the point," says Guillaume Filion, a biologist at the Center for Genomic Regulation in Barcelona, Spain. A statistician named Colin Begg and another statistician named Matthias Egger each invented tests and tools to look for biases that creep into meta-analyses. "Begger's funnel plot" appears to be an accidental hybrid of the two names.

Filion spotted the proliferation of "Begger's" tests by accident. While looking for trends in medical journal articles, he found papers that had almost identical titles, similar choices in graphics and the same quirky errors, such as "Begger's funnel plot." He reckons that the papers came from the same source, even though they are ostensibly written by different groups of authors. "It's difficult to imagine that 28 people independently would invent the name of a statistical test," Filion says. "So that's why we were very shocked."

A quick Internet search uncovers outfits that offer to arrange, for a fee, authorship of papers to be published in peer-reviewed outlets. They seem to cater to researchers looking for a quick and dirty way of getting a publication in a prestigious international scientific journal.

In November Scientific American asked a Chinese-speaking reporter to contact MedChina, which offers dozens of scientific "topics for sale" and scientific journal "article transfer" agreements. Posing as a person shopping for a scientific authorship, the reporter spoke with a MedChina representative who explained that the papers were already more or less accepted to peer-reviewed journals; apparently, all that was needed was a little editing and revising. The price depends, in part, on the impact factor of the target journal and whether the paper is experimental or meta-analytic. In this case, the MedChina rep offered authorship of a meta-analysis linking a protein to papillary thyroid cancer slated to be published in a journal with an impact factor of 3.353. The cost: 93,000 RMB—about $15,000.

The most likely intended outlet for the MedChina-brokered paper is Clinical Endocrinology. It is one of five journals with an impact factor of 3.353 and the closest in subject matter. "Obviously, it's a matter of great concern," says John Bevan, a senior editor at the journal. "I'm distraught to think of this going on and flooding the market." Approximately two weeks after being contacted by Scientific American, Bevan confirmed that a suspicious-looking article about biomarkers for papillary thyroid cancer, which had an author added during revisions, had been identified and rejected.

Much of the funding for these suspect papers comes from the Chinese government. Of the first 100 papers identified by Scientific American, 24 had received funding from the National Natural Science Foundation of China (NSFC), a governmental funding agency roughly equivalent to the U.S.'s National Science Foundation. Another 17 acknowledged grants from other government sources. Yang Wei, president of NSFC, confirmed that the 24 suspicious papers identified by Scientific American were subsequently referred to the Foundation's Bureau of Discipline, Inspection, Supervision and Auditing, which investigates several hundred allegations of misconduct each year. "Tens of disciplinary actions have been taken by NSFC annually for research misconduct, though cases of ghostwriting are less common," Yang e-mailed. Last year one of the agency's disciplinary actions involved a scientist who purchased a grant proposal from an Internet site. Yang stresses that the agency takes steps to combat misconduct, including the recent installation of a "similarity check" for possible plagiarism in grant proposals. (In the year since the system went online the check found several hundred cases of "considerable similarities" out of some 150,000 grant applications, Yang claims.) But when it comes to paper mills, Yang says, "we do not have much experience about this issue and are certainly glad to listen to your suggestions."

Some publishers are only now catching up to the problem of Chinese paper mills. "I wasn't aware there was a market out there for authorship," says Jigisha Patel, BioMed Central's associate editorial director for research integrity. Now that BioMed Central (which is owned by Springer and publishes Diagnostic Pathology) has been alerted to the issue, Patel says, "we now can look into it and address it." Within two weeks of being contacted by Scientific American, BioMed Central announced that it had identified roughly 50 manuscripts that had been assessed by phony peer reviewers. The publisher told the Retraction Watch blog that "a third party may be involved, and influencing the peer review process." It is possible that these manuscripts came from paper mills. We were able to look at the titles and authors of about half a dozen of those papers. All appear very similar in style and subject matter to other paper mill-written meta-analyses, and all were from groups of Chinese authors.

Other publishers have begun to combat the flood of dubious papers. Damian Pattinson, editorial director of PLoS ONE, says the journal instituted safeguards last April. "[E]very meta-analysis we get has to go through a specific editorial check..." that forces authors to provide additional information, including a justification for why they performed the study in the first place, he says. "As a result of this, the rate of papers that are actually getting to reviewers has dropped by about 90 percent. So we are very aware of this issue." Even so, the list compiled by Scientific American contains four suspect papers that were published in PLoS ONE after the safeguards were instituted, and authorship on an upcoming PLoS ONE article was put up for sale by MedChina as this article was being written. When we asked Pattinson about these, he replied: “We will correct and retract papers if there is any indication of misconduct. It’s a problem issue and one that we’re very aware of.”

BMC, Public Library of Science and other publishers use plagiarism-checking software to try to cut down on fraud. Software, however, doesn't always solve the issue of plagiarism in journals, Patel warns. Paper mills “add another layer of complexity to the problem. It's very worrying."

Publishers at the moment are fighting an uphill battle. "Without insider information it's very difficult to police this," Clinical Endocrinology's Bevan says. CE and its publisher, Wiley, are trying to close loopholes in the editorial process to flag suspicious late changes in authorship and other irregularities. "You have to accept that people are submitting things in good faith and honesty," Bevan says.

That is the essential threat. Now that a number of companies have figured out how to make money off of scientific misconduct, that presumption of honesty is in danger of becoming an anachronism. "The whole system of peer review works on the basis of trust," Pattinson says. "Once that is damaged, it is very difficult for the peer review system to deal with."

"We've got a problem here," Filion says. He believes that the deluge is just beginning. "There is so much pressure and so much money at stake that we're going to see all sorts of excesses in the future."

Additional reporting by Paris Liu.

The list below contains 100 published articles that would seem to have the hallmarks of fill-in-the-blanks science. Inclusion in this list does not imply that any given article was written by a paper mill, nor does it imply that the article is definitely plagiaristic. Given the pattern of writing and these articles' similarities to previously published work, however, we believe they are worthy of scrutiny by their publishers.

There are many more suspicious articles out there, and more are being published every day. These are simply the first 100 we found.

*Correction (2/13/15): This sentence was edited after posting. The original cited the number of articles in the May 2014 issue of Diagnostic Pathology as 14.

Further Reading:
Filion, Guillaume. "A flurry of copycats on PubMed."

Oransky, Ivan. "Publisher discovers 50 manuscripts involving fake peer reviewers."

Ioannidis J.P.A., Chang C. Q., Lam T. K., Schully S. D., Khoury M. J. "The Geometric Increase in Meta-Analyses from China in the Genomic Era." PLoS ONE 8(6): e65602. doi:10.1371/journal.pone.0065602

Hvistendahl, Mara. "China's Publication Bazaar." Science, 29 November 2013, pp. 1035–1039. DOI: 10.1126/science.342.6162.1035