‘What can we do to make a greater fraction of studies reproducible?’ This answer focuses on US biomedical research, though similar forces at play the world over make its observations broadly generalizable.
In the normal thrum of how science operates in a typical lab, re-testing something from a publication is very much the norm, though it's rarely done explicitly to confirm an entire study. More often, the lab boss will nudge their graduate students, post-docs or technicians to add some new controls to their experiments, one or more of which might be key aspects of a hot new study. Ambitious trainees may even do so on their own initiative.
Bits and pieces of an entire study thus get confirmed over time by bits and pieces of many other studies done by various labs and groups in service of their own research priorities and ideas, not as deliberate efforts to confirm another's results. That's the strength of scientific methodology in a nutshell, and it's also why issues with the reproducibility of published studies have come to light in increasing numbers in recent years: that's science working as it's supposed to.
However, increasing reports of irreproducibility also suggest something's awry. Akin to blips across a radar screen, now and then articles in leading scientific journals as well as the general news media light up about this issue; in recent years, psychology and biomedical research have figured prominently. A bit of grumbling, teeth-gnashing and hand-wringing later, the scientific status quo settles back into place, little changed, meaning we haven't yet reached a tipping point.
How could it be otherwise when change entails changing so much? Not just how but also why scientists do their work, as well as the metrics used to assess the quality of that work and to reward and promote them. Rather than pretend the problem of study reproducibility is reducible to an easily digestible list of nostrums, especially when it encompasses such different areas as basic, translational and clinical research, this answer briefly outlines:
- Key questions for everyone in biomedical research, from practitioners and gatekeepers to decision-makers.
- How the unforeseen consequences of incentives buttress the current system and give short shrift to reproducibility.
Where reproducibility in biomedical research is concerned, the key questions remain:
- Will top-tier scientific journals publish reproducibility studies at a clip similar to those deemed novel? A recent dust-up involving the New England Journal of Medicine (NEJM), arguably one of the world's most prestigious medical journals, inadvertently revealed hidden disdain for data sharing and independent data analysis. A 2016 opinion piece by the journal's chief editors about the pros and cons of data sharing referred to the concern (see below from 1)
‘among some front-line researchers that the system will be taken over by what some researchers have characterized as “research parasites.”’
This article was greeted with howls of derision by some, led to the brief trending of the hashtag #IamAResearchParasite, and became beset by accusations of ‘paternalistic arrogance’ (2). A year later, Jeffrey Drazen, the journal's editor-in-chief, did a 180, now calling for (see below from 3),
‘“culture change” in the scientific community about clinical trials: Instead of solely glorifying researchers who author papers, scientists should also bestow reverence upon those who generate high-quality data sets for others to analyze.’
Whether that will happen is anyone’s guess.
- Will grant funders and employers reward those who share data, or who re-analyze, curate or annotate others' data, in the same coin (promotions, tenure, prestige, etc.) as those who publish original studies?
These are just the obvious changes necessary to enhance reproducibility's priority and prestige in biomedical research. However, they address what, not why, as in: why has it even become necessary to ask such questions?
Unforeseen consequences of incentives buttress the current system and give short shrift to reproducibility
Obviously, policies and regulations alone won't suffice here because this is about changing unspoken norms in scientific culture. In a way, the buzzy reproducibility crisis may be a sign of biomedical research becoming a victim of its own success. Scientific methodology has long proven reliable: no matter the biases that might plague individual scientists, no matter the false starts and dead-ends, the idea goes, facts eventually shake out because scientific methodology prevails.
Even Thomas Kuhn, whose most controversial work, The Structure of Scientific Revolutions, pre-eminent philosophers of science such as Karl Popper considered quite the poke in the eye, essentially believed in the primacy of scientific methodology. If it prevails no matter what, reproducibility shouldn't even be an issue, right? However, the reproducibility crisis suggests not, bringing us back to how biomedical research may have become a victim of its own success.
Farnam Street is a blog maintained by Shane Parrish. Its tagline says it all: ‘Mastering the best of what other people have already figured out’. A thought-provoking article from May 2016 about the unintended and often corrosive consequences of incentives (4) does such an excellent job of illustrating the issue with a couple of historical examples that I quote the first one below in its entirety,
‘During British colonial rule of India, the government began to worry about the number of venomous cobras in Delhi, and so instituted a reward for every dead snake brought to officials. In a wonderful demonstration of the importance of second-order thinking, Indian citizens dutifully complied and began breeding venomous snakes to kill and bring to the British. By the time the experiment was over, the snake problem was worse than when it began. The Raj government had gotten exactly what it asked for.’
The second, downright grisly, example concerns King Leopold II of Belgium and how (see below from 4),
‘Looking to bolster an economy of rubber, Leopold II got an economy of severed hands. Like the British Raj, he got exactly what he asked for.’
Similarly, unforeseen consequences of incentives that privilege certain outcomes over process help explain biomedical research's reproducibility crisis, though obviously these outcomes aren't as grisly for humans, especially if we choose to ignore outcomes for mice and other experimental animal models.
How to assess scientific excellence? The process currently in place is largely the outcome of empiricism initiated in the early years of the 20th century by statisticians such as James McKeen Cattell, who, with his coining of Homo Scientificus Americanus (5), attempted to measure the scientific ‘productivity’ of ‘men of science’. Eventually this notion of productivity coalesced around output, i.e., the number of scientific papers (6, 7).
But how to deem a submission worthy of publication? A predictable response was to ask whether the work was novel. As this criterion got codified, eventually every decision maker in the US biomedical enterprise began to prioritize and reward novelty above all. Scientific publications, promotions, grants, tenure: each of the badges necessary for a successful scientific career considers novelty of the work a necessary criterion.
What began as a surrogate to suss out excellence has, through a combination of expediency and complacency, instead become enshrined as the centerpiece that consumes all the proverbial oxygen in the decision-making process for identifying scientific quality, even at the expense of other essential attributes such as reproducibility. Along the way, assessment became synonymous with measurement; specifically, quantity became a surrogate for quality.
As such tendencies spread the world over, additional metrics were contrived to measure quantity, *cough*, quality, the number of times a given paper gets cited or the h-index being cases in point. Meantime, academic journals jumped to differentiate themselves from the pack using metrics such as the impact factor. Academic journals mushroomed in the internet era, a clear sign that the American approach to assessing scientific quality had globalized. However, that publication in top journals doesn't guarantee study reliability only underscores the low priority accorded reproducibility.
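To make concrete just how little such a metric captures, here is a minimal sketch of the h-index calculation, the largest h such that h of a researcher's papers have each been cited at least h times; the citation counts below are hypothetical, purely for illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited papers first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:  # this paper still clears the bar set by its rank
            h = rank
        else:
            break
    return h

# A hypothetical career: five papers cited 10, 8, 5, 4 and 3 times.
print(h_index([10, 8, 5, 4, 3]))  # -> 4
```

A single integer, blind to whether any of the cited findings ever held up on re-testing, which is rather the point.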
Meantime, other decades-old sociological trends in biomedical research further entrenched novelty, with recent generations of biomedical scientists getting trained to value and single-mindedly focus on it, even as the larger social forces shaping the scientific enterprise have contrived to minimize the value of reproducibility.
- One is the intensified competition for jobs and research funds. Competition influences the practice of biomedical research in many consequential ways.
- It places greater reliance on conveniently quantifiable metrics to assess scientific excellence and, over the course of the 20th century, scientific publications increasingly became the expedient peg to hang that hat on.
- It amplifies the pressure to publish even as it truncates project timelines, with the Great Recession bringing such existing constraints to a boil by sharply curtailing research funding.
- Tenured staff stay longer in their jobs even as US universities churn out ever-increasing numbers of PhDs without expanding tenured positions enough to absorb them. The ensuing glut intensifies competition as well as conveniently feeds publish-or-perish by providing a ready-made army of well-trained, relatively cheap labor to do the work necessary to keep biomedical research humming along.
- Disproportionate focus on publication worthiness reinforces the reproducibility problem, with negative data tending not to see the light of day. Now called publication bias, this phenomenon was first alluded to as the file-drawer problem by Robert Rosenthal all the way back in 1979. Data not supporting previously published work doesn't get submitted for publication at all, an expression of self-censorship (a back-of-the-envelope illustration follows this list).
- The much shorter lifespan of technical methods feeds the tendency to minimize reproducibility. So rapid is the current rate of change of many lab techniques that some even turn over with the turnover of lab staff such as graduate students and post-docs. Such newer trends push reproducibility as a priority even further out of reach.
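As for the file-drawer problem, Rosenthal's 1979 paper paired it with a simple calculation, the fail-safe N: how many unpublished null results would have to sit in file drawers to drag a literature's combined significance below the conventional threshold. Here is a minimal sketch of that calculation, using hypothetical Z-scores in place of a real literature.

```python
import math

def fail_safe_n(z_scores, z_crit=1.645):
    """Rosenthal's fail-safe N: the number of unpublished null results (Z = 0)
    needed to pull the combined Stouffer Z below the one-tailed 0.05 cutoff."""
    k = len(z_scores)
    total_z = sum(z_scores)
    n = (total_z ** 2) / (z_crit ** 2) - k
    return max(0, math.ceil(n))

# Ten published studies, each with a modest Z of 2.0, look unassailable...
print(fail_safe_n([2.0] * 10))  # -> 138
```

...unless some 138 quiet failures to replicate never left the file drawer, which is precisely the self-censorship at issue.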
1. Longo, Dan L., and Jeffrey M. Drazen. “Data sharing.” New England Journal of Medicine 374.3 (2016): 276-277.
2. Ornstein, Charles. ProPublica, April 5, 2016.
3. Swetlitz, Ike. STAT News, April 4, 2017.
4. “Incentives Gone Wrong: Cobras, Severed Hands, and Shea Butter.” Farnam Street blog, May 2016.
5. Cattell, J. McKeen. “Homo scientificus americanus.” Science 17.432 (1903): 561-570.
6. Godin, Benoît. “From eugenics to scientometrics: Galton, Cattell, and men of science.” Social Studies of Science 37.5 (2007): 691-728.
7. Godin, Benoît. “The value of science: changing conceptions of scientific productivity, 1869 to circa 1970.” Social Science Information 48.4 (2009): 547-586.