Study shows the need to improve how scientists approach early-stage cancer research

Preclinical studies, the kind that scientists perform before testing in humans, don’t get as much attention as their clinical counterparts. But they are the vital first steps toward eventual treatments and cures. It’s important to get preclinical findings right. When they are wrong, scientists waste resources pursuing false leads. Worse, false findings can trigger clinical trials in humans.

Last December, the Center for Open Science (COS) released the worrying results of its eight-year, US$1.5-million Reproducibility Project: Cancer Biology. In the project, conducted in collaboration with the research marketplace Science Exchange, independent scientists found that the odds of replicating the results of 50 preclinical experiments from 23 high-profile published studies were no better than a coin toss.

Praise and controversy have followed the project from the beginning. The journal Nature applauded the replication studies as “the practice of science at its best.”

But the journal Science noted that reactions from some scientists whose studies were chosen ranged from “annoyance to anxiety to outrage,” impeding the replications. None of the original experiments was described in enough detail to allow scientists to repeat them, yet a third of the original authors were uncooperative, and some were even hostile, when asked for assistance.

COS executive director Brian Nosek cautioned that the findings pose “challenges for the credibility of preclinical cancer biology.” In a tacit acknowledgement that biomedical research has not been universally rigorous or transparent, the U.S. National Institutes of Health (NIH), the largest funder of biomedical research in the world, has announced that it will tighten its requirements for both.

I have taught classes and written about good scientific practice in psychology and biomedicine for over 30 years. I’ve reviewed more grant applications and journal manuscripts than I can count, and these findings do not surprise me.

The twin pillars of trustworthy science, transparency and dispassionate rigor, have wobbled under the stress of incentives that advance careers at the expense of reliable science. Too often, proposed preclinical studies, and, surprisingly, even published peer-reviewed ones, don’t follow the scientific method. Too often, scientists do not share their government-funded data, even when the publishing journal requires it.

Controlling for bias

Many preclinical experiments lack the rudimentary controls against bias that are taught in the social sciences, though rarely in biomedical disciplines such as medicine, cell biology, biochemistry and physiology. Controlling for bias is a key element of the scientific method because it allows scientists to disentangle experimental signal from procedural noise.

Confirmation bias, the tendency to see what we want to see, is one type of bias that good science controls by “blinding.” Think of the “double-blind” procedures in clinical trials in which neither the patient nor the research team knows who is getting the placebo and who is getting the drug. In preclinical research, blinding experimenters to samples’ identities minimizes the chance that they will alter their behavior, however subtly, in favor of their hypothesis.
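
To make the idea concrete, here is a minimal sketch in Python of how a study coordinator might blind samples before an experimenter measures them. The sample labels and function name are hypothetical; real laboratories typically use dedicated software or a third party to hold the code key.

```python
import random

def blind_samples(samples, seed=None):
    """Replace true group labels with neutral codes so the person
    taking measurements cannot tell treatment from control."""
    rng = random.Random(seed)
    shuffled = samples[:]          # copy so the input list is untouched
    rng.shuffle(shuffled)
    # The key (code -> true sample) is held by someone who does not
    # take the measurements, and is opened only after data collection.
    key = {f"S{i:03d}": sample for i, sample in enumerate(shuffled, 1)}
    coded_ids = sorted(key)        # the experimenter sees only these codes
    return coded_ids, key

# Hypothetical example: four mice, two per group.
samples = [("mouse-1", "drug"), ("mouse-2", "placebo"),
           ("mouse-3", "drug"), ("mouse-4", "placebo")]
coded_ids, key = blind_samples(samples, seed=2024)
print(coded_ids)  # ['S001', 'S002', 'S003', 'S004'] -- no group information
```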

Seemingly trivial differences, such as whether a sample is processed in the morning or afternoon or whether an animal is caged in the upper or lower row, can also change results. This is not as unlikely as it might sound: moment-to-moment changes in the micro-environment, such as exposure to light and air ventilation, can alter physiological responses.

If all animals who receive a drug are caged in one row and all animals who do not are caged in another, any difference between the two groups may be due to the drug, to their housing location or to an interaction between the two. You can’t honestly choose among these explanations, and neither can the scientists.

Randomizing sample selection and processing order minimizes these procedural biases, makes the interpretation of the results clearer, and makes them more likely to be replicated.
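
As a sketch of what this looks like in practice, the Python fragment below randomizes group assignment, cage position and processing order in one pass. The animal names and function name are illustrative assumptions, not a real laboratory protocol.

```python
import random

def randomize_design(animals, seed=None):
    """Randomly assign animals to drug vs. control, scatter them
    across cage positions, and shuffle the processing order, so
    treatment is not confounded with housing or time of day."""
    rng = random.Random(seed)
    order = animals[:]
    rng.shuffle(order)
    half = len(order) // 2
    groups = {a: ("drug" if i < half else "control")
              for i, a in enumerate(order)}
    cages = order[:]
    rng.shuffle(cages)        # cage row no longer tracks treatment
    processing = order[:]
    rng.shuffle(processing)   # morning/afternoon no longer tracks treatment
    return groups, cages, processing

groups, cages, processing = randomize_design(
    [f"mouse-{i}" for i in range(1, 9)], seed=7)
print(groups)   # treatment assignment is independent of cage and order
```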