Why Bad Science Is Sometimes More Appealing Than Good Science

Researchers cite studies that can’t be replicated weirdly often

A recent paper makes an upsetting claim about the state of science: nonreplicable studies are cited more often than replicable ones. In other words, according to the report in Science Advances, bad science seems to get more attention than good science.

The paper follows up on reports of a “replication crisis” in psychology, wherein large numbers of academic papers present results that other researchers are unable to reproduce—as well as claims that the problem is not limited to psychology. This matters for several reasons. If a substantial proportion of science fails to meet the norm of replicability, then this work won’t provide a solid basis for decision-making. Failure to replicate results may delay the use of science in developing new medicines and technologies. It may also undermine public trust, making it harder to get Americans vaccinated or to act on climate change. And money spent on invalid science is money wasted: one study puts the cost of irreproducible medical research in the U.S. alone at $28 billion a year.

In the new study, the authors tracked papers in psychology journals, economics journals, and Science and Nature with documented failures of replication. The results are disturbing: papers that couldn’t be replicated were cited more than average, even after the news of the reproducibility failure had been published, and only 12 percent of the citations made after that news appeared acknowledged the failure.

These results parallel those of a 2018 study of online news. That analysis of 126,000 rumor cascades on Twitter showed that false news spread faster and reached more people than verified true claims. It also found that bots propagated true and false news in equal proportions: it was people, not bots, who were responsible for the disproportionate spread of falsehoods online.

A potential explanation for these findings involves a two-edged sword. Academics valorize novelty: new findings, new results, “cutting-edge” and “disruptive” research. On one level this makes sense. If science is a process of discovery, then papers that offer new and surprising things are more likely to represent a possible big advance than papers that strengthen the foundations of existing knowledge or modestly extend its domain of applicability. Moreover, both academics and laypeople experience surprises as more interesting (and certainly more entertaining) than the predictable, the normal and the quotidian. No editor wants to be the one who rejects a paper that later becomes the basis of a Nobel Prize. The problem is that surprising results are surprising because they go against what experience has led us to believe so far, which means that there’s a good chance they’re wrong.

The authors of the citation study theorize that reviewers and editors apply lower standards to “showy” or dramatic papers than to those that incrementally advance the field and that highly interesting papers attract more attention, discussion and citations. In other words, there is a bias in favor of novelty. The authors of the Twitter study also point to novelty as a culprit: they found that the false news that spread rapidly online was significantly more unusual than the true news.

Novel claims have the potential to be very valuable. If something surprises us, it indicates that we might have something to learn from it. The operative word here is “might” because this premise presupposes that the surprising thing is at least partly true. But sometimes things are surprising and wrong. All of this indicates that researchers, reviewers and editors should take steps to correct their bias in favor of novelty, and suggestions have been put forward for how to do this.

There is another problem. As the authors of the citation study note, many replication studies focus on splashy papers that have received a lot of attention. But these are more likely than average to fail to hold up under further scrutiny. A review focused on showy, high-profile papers is not going to be reflective of science at large, which is itself a failure of the norm of representativeness. In one case that I have discussed elsewhere, a paper flagging reproducibility problems failed to reveal the researchers’ own methods, yet this paper has been (yes) highly cited. So scientists must be careful that in their quest to flag papers that couldn’t be replicated, they don’t create flashy but flimsy claims of their own.
