This is an imperfect report on the special session on reproducibility that recently took place at the annual conference of the Society for Laboratory Automation and Screening (SLAS). I suggested this panel discussion and co-organized it with Cathy Tralau-Stewart, with the hope of having many of the parties that can play a role in improving reproducibility in the same room, sharing and brainstorming together (more on the idea).

The panelists were:

Cathy Tralau-Stewart — Associate Director, Catalyst & Associate Professor of Therapeutics (Adj), University of California San Francisco, San Francisco, CA, USA

Ivan Oransky — Co-Founder, Retraction Watch; Distinguished Writer in Residence, New York University’s Arthur Carter Journalism Institute, New York, NY, USA

Veronique Kiermer — Executive Editor, PLOS, San Francisco, CA, USA

Elizabeth Iorns — Co-founder and CEO, Science Exchange, Palo Alto, CA, USA

Tara A. Schwetz — Senior Advisor to the Principal Deputy Director, National Institutes of Health (NIH), Bethesda, MD, USA

Richard Neve — Senior Research Scientist, Gilead Sciences

For a reproducibility geek like me, the panel discussion was riveting. That means I took rather poor notes, but the summary below should capture some of the important points and ideas that came up during the dynamic two-hour discussion, masterfully moderated by NPR’s Richard Harris.

----------------------------

LT (introducing the session): There is a tendency to point fingers when discussing reproducibility. Academics say that it’s just a problem for pharmaceutical companies. Industry blames sloppy academics. Journals and funders are often scapegoated as the ones who could wave a magic wand and solve it all. The reality is that it is everyone’s problem, and everyone has a responsibility and a role to play in improving reproducibility.

Richard Harris (RH) to Elizabeth Iorns (EI): You recently published the first results from the Reproducibility Project: Cancer Biology (RPCB). Tell us a bit about it and how you got involved. Why?

EI: It was a personal experience as a cancer researcher, before Science Exchange, that made me aware of the issue. Then, after launching Science Exchange, we started to see many requests to validate published results, and the outcomes of these validations spanned the full spectrum, from replicating perfectly to not at all.

RH to Ivan Oransky (IO): Why start Retraction Watch?

IO: It is a good source of stories. Scott Reuben (Celebrex) went to prison for fraud. About a year after Adam Marcus covered that, I suggested, “let’s start a blog?” Adam at that point probably didn’t know how to spell “HTML,” and I barely did, either. We started it, unbeknownst to us, just as the number of retractions per year was exploding ten-fold. By the way, retractions should be seen as a very specific case of failure to reproduce; they’re not necessarily a good solution for the larger problem.

RH to Cathy Tralau-Stewart (CTS): Tell us a bit about your background and why you think about this.

CTS: After 20 years at GSK, I went into academia at UCSF to help push some of the research towards translation. Academia has great science, but it’s important to help push it forward. Reproducibility is not an industry versus academia issue; it’s a science issue.

RH to Tara Schwetz (TS): Why did NIH get involved rather than sweep this under the rug?

TS: The Larry Tabak and Francis Collins editorial was the launch of a serious effort from NIH to study this and become more active in trying to improve it.
(I have a note saying “January of 2016 effect”, but I don’t know what this means. The changes to the application instructions and review criteria went into effect for applications received after January 25, 2016.)

RH to Richard Neve (RN): What happened at Genentech that pulled you into this?

RN: When I got to Genentech, I learned just how inefficient the process was for getting cell lines within the company. Completely on the side, I started to build a cell bank [read more here]. Over time, it grew to 100,000 vials. We designed and implemented high-throughput validation and learned that while many people thought they had great cell lines, the lines were in fact not good at all.

RH: Can academia do this?

RN: UCSF has this, but it is costly. Not every organization needs to spend a ton on a cell bank; the tests themselves are easy and cost-effective.

RH to Veronique Kiermer (VK): One often hears, “if gatekeepers/journals did their job, there would be no reproducibility issues.”

VK: Journals are indeed a target, but this is not a one-person or one-sector problem. Of course, journals do have a responsibility, and many are taking serious steps.

RH to EI: As you learned from the Reproducibility Project work, none of the papers had enough detail to reproduce, is that right?

EI: Indeed. The way studies are published now, there is not enough detail to replicate them. We can learn from the Reproducibility Project how to publish original manuscripts in a way that makes follow-up easier.

Audience comment from Joanne Kamens of Addgene: There is much low-hanging fruit; much can be done easily and quickly.

EI: Yes. When plasmids are in Addgene, it’s easy. When we need to get them directly from the lab that published, it can take over a year.

IO: To get people to share protocols and reagents, we need the right incentives.

Joanne Kamens: The incentives are there for sharing plasmids. If more journals would guide the authors to do so…

RN: When we set up the cell bank at Genentech, there was reticence, but it saved people time in the end. It’s about behavior and about incentives. In the publish-or-perish world, there is no time to work properly and take the time to do it right, because of the drive to publish.

IO: Another issue is that some researchers don’t want their data to be analyzed by others. I recently spoke at a meeting where a scientist got up and said, basically, “it’s my data, I will publish on it, I don’t want to give it away.”

VK: We have to take into account that there are currently real difficulties with sharing some types of data, in particular related to privacy for clinical and individual-patient data. Better infrastructure and consensus on how to share patient data responsibly would help enormously, and it can be very important to share this kind of data. But the argument that such data cannot be shared because it somehow belongs to the researcher is a weak one. It is not your data; it is the patient's data.

TS: And for basic research too, it’s not your data: if it’s NIH funded, it’s the taxpayers’ data.

VK: Regarding incentives, we currently reward exploratory work: low sample size, low power, high impact. There are few rewards for publishing replications.

EI: NIH does not fund any replication studies.

(Silent comment from Lenny Teytelman: NIH indirectly funds replication all the time, as it is much of what every academic does, no matter what their grant proposal. We just don’t fund and incentivize the publishing of the replication attempts.)
CTS: Let’s not think of it as “replication studies”, but as “quality assurance.”

IO: I believe strongly in peer review. It nudges towards quality. But peer review is only one of a dozen tools that promote reproducibility. We need to stop revering the top journals, and we need to stop revering peer review.

EI: Incentives exist for exciting exploratory work, but not for replication. Someone needs to pay for this.

Audience comment from Laurie Goodman of GigaScience: We need more transparency in the publishing process. Checklists can help.

VK to LG: Do you publish these checklists?

LG: Good point. We don’t, but we should and will!

TS: NIH has also been recommending guidelines, referring to the Principles and Guidelines for Reporting Preclinical Research: https://www.nih.gov/research-training/rigor-reproducibility/principles-guidelines-reporting-preclinical-research

RN: Guidelines are good, but we need enforcement. Also, I’d love for our discussion to shift from a focus on the problem to one on solutions. What can we all do? My personal feeling is that we need to rethink how we write papers. We write them all wrong. We write them as stories, which are fun to read, but they need to be drier and less revisionist, with less framing: give the results, methods, tables, and figures. Everything I published before is crap and written wrong.

Audience comment from Dana Vanderwall: What about metrics around quality?

IO: Show me a metric that cannot be gamed. Be careful what you wish for.

RH: The Center for Open Science is experimenting with quality badges for psychology. It’s a little silly, but it works.

RN: On a positive note, science is somewhat self-correcting. The papers that don’t reproduce tend not to be cited and fall into oblivion. We need a way to upload results of preliminary replications to create a consilience of data.

EI: We have noticed with the RPCB that even controls often do not reproduce. Getting a transparent record of all the replications of the controls would be very helpful.

IO: Actually, some pharmaceutical companies do publish a lot of their raw data.

VK: There is momentum for preprints, and they can play an important role.

Audience comment: As a reviewer, it is harder and harder to check papers because of the rapid evolution of fields and techniques. Reviewers are not experts in everything.

RN: I review and explicitly state, “I don’t know how to judge this part.” We need separate reviewers: a protocol reviewer, a statistics reviewer, etc. We also need more internal review.

Another audience comment: We need continuous peer review, post-publication.

EI: It is hard to be the one to say, “I replicated and the effect did not reproduce.” It is a huge challenge to publish disagreements.

IO: Being a whistleblower is hard. At least four out of five people who approach us with possible stories end up deciding not to do it.

Audience comment from Rebecca Davies, UMN: How can we improve training on rigor and reproducibility?

TS: NIH had 30 pilot grants for training. (6 awards, 26 admin supplements.)

RH: Jon Lorsch put out a call: “Who has a good training program? Tell us about it.” No replies.

TS: We had people saying, “well, we have a statistics course” or “we have an RCR/ethics course.”

IO: “Unfortunately, my role in life was to become a prophet of doom,” says Ivan, introducing this systematic review, which found that “Due to the very low quality of evidence, the effects of training in responsible conduct of research on reducing research misconduct are uncertain.”

VK: We need to bring quality assurance to academia.
EI: Robots and automation are not as far away as people think.

RH: We have just five minutes left, so let’s finish with each panelist proposing one great idea to improve reproducibility.

TS: We need to make preprints more valuable. Funding agencies can play a role in this.

VK: More transparency in reporting. All publishers should adopt open data.

IO: This will sound crazy, but as Ferric Fang and Arturo Casadevall recently wrote, we should think about giving grants as a lottery.

CTS: Truly validated reagents.

EI: Full data, full reagents – seconded. Also, confirmatory studies must be rewarded.

RH: Take heart – academics can learn from the success of clinicaltrials.gov.