Science is a system for accumulating knowledge. The credibility of knowledge claims relies, in part, on the transparency and repeatability of the evidence used to support them. As a social system, science operates with norms and processes to facilitate the critical appraisal of claims, and transparency and skepticism are virtues endorsed by most scientists (Anderson et al., 2007). Science is also relatively non-hierarchical in that there are no official arbiters of the truth or falsity of claims. However, the interrogation of new claims and evidence by peers occurs continuously, and most formally in the peer review of manuscripts prior to publication. Once new claims are made public, other scientists may question, challenge, or extend them by trying to replicate the evidence or to conduct novel research. The evaluative processes of peer review and replication are the basis for believing that science is self-correcting. Self-correction is necessary because mistakes and false starts are expected when pushing the boundaries of knowledge. Science works because it efficiently identifies those false starts and redirects resources to new possibilities.
We believe everything we wrote in the previous paragraph except for one word in the last sentence – efficiently. Science advances knowledge and is self-correcting, but we do not believe it is doing so very efficiently. Many parts of research could improve to accelerate discovery. In this paper, we report the challenges confronted during a large-scale effort to replicate findings in cancer biology, and describe how improving transparency and sharing can make it easier to assess rigor and replicability and, therefore, to increase research efficiency.
Transparency is essential in any system that seeks to evaluate the credibility of scientific claims. To evaluate a scientific claim one needs access to the evidence supporting the claim – the methodology and materials used, the data generated, and the process of drawing conclusions from those data. The standard process for providing this information is to write a research paper that details the methodology and outcomes. The combined effect of constraints related to the research paper format (including word limits, and only reporting what can be described in words), the tendency of authors to report what they perceive to be important, and rewards for exciting, innovative outcomes is an emphasis on reporting outcomes and their implications, rather than a comprehensive description of the methodology (Kilkenny et al., 2009; Landis et al., 2012; Moher et al., 2008). For example, selectively reporting experiments or analyses, particularly reporting only those that 'worked', biases the literature by ignoring negative or null results (Fanelli, 2010; Fanelli, 2011; Ioannidis, 2005; Rosenthal, 1979; Sterling, 1959; Sterling et al., 1995).
We conducted the Reproducibility Project: Cancer Biology to investigate the replicability of preclinical research in cancer biology. The initial aim of the project was to repeat 193 experiments from 53 high-impact papers, using an approach in which the experimental protocols and plans for data analysis had to be peer reviewed and accepted for publication before experimental work could begin. However, the various barriers and challenges we encountered while designing and conducting the experiments meant that we were only able to repeat 50 experiments from 23 papers.
First, many original papers failed to report key descriptive and inferential statistics: the data needed to compute effect sizes and conduct power analyses were publicly accessible for just 4 of 193 experiments. Moreover, despite contacting the authors of the original papers, we were unable to obtain these data for 68% of the experiments. Second, none of the 193 experiments were described in sufficient detail in the original paper to enable us to design protocols to repeat the experiments, so we had to seek clarifications from the original authors. While authors were extremely or very helpful for 41% of experiments, they were minimally helpful for 9% of experiments, and not at all helpful (or did not respond to us) for 32% of experiments. Third, once experimental work started, 67% of the peer-reviewed protocols required modifications to complete the research and just 41% of those modifications could be implemented. Cumulatively, these three factors limited the number of experiments that could be repeated.
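To make concrete why reported descriptive statistics matter for planning a replication, the following is a minimal sketch of how an effect size and a power analysis can be computed when group means, standard deviations, and sample sizes are available. The summary statistics shown are hypothetical, and the use of the statsmodels Python package is purely illustrative; neither is drawn from the original papers or from the project's registered protocols.
```python
# Illustrative sketch only; the numbers below are hypothetical, not taken from any original paper.
from statsmodels.stats.power import TTestIndPower

# Hypothetical summary statistics from an original two-group experiment
mean_treated, sd_treated, n_treated = 42.0, 9.0, 8
mean_control, sd_control, n_control = 30.0, 10.0, 8

# Standardized effect size (Cohen's d) using the pooled standard deviation
pooled_sd = (((n_treated - 1) * sd_treated**2 + (n_control - 1) * sd_control**2)
             / (n_treated + n_control - 2)) ** 0.5
cohens_d = (mean_treated - mean_control) / pooled_sd

# Sample size per group needed to detect that effect with 80% power at alpha = 0.05
n_per_group = TTestIndPower().solve_power(effect_size=cohens_d, alpha=0.05,
                                          power=0.80, alternative='two-sided')
print(f"Cohen's d = {cohens_d:.2f}; required n per group = {n_per_group:.1f}")
```
Without the means, standard deviations, and sample sizes (or equivalent inferential statistics) from the original experiment, neither the effect size nor the required sample size for a replication can be calculated, which is why their absence was the first barrier we encountered.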
This experience draws attention to a basic and fundamental concern about replication – it is hard to assess whether reported findings are credible. Here we report these barriers and challenges.