Tag Archives: reproducibility

Protocols.io, the online open repository for lab protocols.

Like many others, I have collected my fair share of profiles on professional and Science social networks: LinkedIn, Academia.edu, ResearchGate. The real limitation all these websites share is that they basically provide a showcase, in which you can expose yourself, sport your achievements, share your professional profile, and look as cool as you can. I have always felt that proper tools for collaboration and information sharing in Science were lacking on the internet. Social networking for scientists is limited to the mere communication and discussion of results, whereas it would be really useful to have platforms for sharing data and protocols.

That is why, when I heard about protocols.io on Twitter, the project caught my attention immediately. Protocols.io is an online community serving as a repository for experimental protocols in the Life Sciences: a free, central, up-to-date, crowdsourced protocol database for life scientists. The project is promoted and maintained by ZappyLab, an organisation of scientists whose goal is to provide tools for sharing protocols and lab methods.

Registration is open and pretty simple. Unlike other science communities, you don’t need to provide an “institutional mail address”: any address goes, and this is great for undergraduates and graduate students who may not have an official one. You subscribe with your email, and that is enough to make your way to a growing list of lab protocols. You can share your own protocols, deciding whether to make them publicly available or share them privately with your colleagues only. You may also enjoy the benefits of having a smartphone, as ZappyLab provides an application for Android and iOS, available on the respective marketplaces.

At this very moment I am exploring the website and figuring out how to use it, but it seems pretty simple and user-friendly. Of course, the number of available protocols is not very high yet, but this depends on the number of subscribers. The more we are, the more we share, the more protocols will be available.

I applaud this project, as I think it may represent a great contribution. We always talk a great deal about “open science”, “reproducibility” and the freedom of knowledge. But most of the time, we limit ourselves to blaming the publishing groups for their copyright policies, calling for greater openness. But what are we doing ourselves to help Science become more open? Sharing your protocols is a fairly good contribution, and I hope that you will turn your attention to, and contribute to, this amazing project.

 

Reproducibility in computational research: 10 simple rules.

How do you keep your work reproducible and replicable? Once you have finished your genome-wide analysis, NGS data mining, coding, homology modeling or biostatistics, how can you make your entire job available and testable by other people? The need for a proper strategy to guarantee the reproducibility of research is a major question in almost any branch of Science, and becomes dramatic in computational research. The large number of methods available, and the massive quantity of information produced, tend to frustrate our efforts to keep our work replicable and reproducible.

Even though everyone working with computers develops very personal working habits, there is always a trick or two to improve them and render your work more reproducible. Broadly speaking, this is what is pointed out in a very recent paper published in PLOS Computational Biology. More than a couple of tricks, a real decalogue is proposed to improve the reproducibility of your work: Ten Simple Rules for Reproducible Computational Research, which the authors argue are pretty effective. While I suggest you read this very good paper carefully, here I just list and briefly discuss each rule.

Rule 1: For Every Result, Keep Track of How It Was Produced. Annotations are fundamental. Very often, one ends up tagging data quickly, just so as not to forget where they come from. But an extensive, explanatory legend will help your co-workers and reviewers understand what you have done.

Rule 2: Avoid Manual Data Manipulation Steps. Take your time, be patient, and write down a couple of lines of code. Manual data manipulations are the first source of human error and reduce the verifiability of your work.
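As a minimal sketch of what this rule means in practice, here is a "manual" cleanup step (dropping low-quality records from a table) encoded as a small, rerunnable function instead of hand-editing a spreadsheet. The field names, sample data and threshold are all invented for illustration:

```python
def filter_records(records, threshold=0.9):
    """Keep only records whose quality score meets the threshold.

    The criterion lives in code, so it is documented and repeatable,
    unlike a one-off manual edit.
    """
    return [r for r in records if r["score"] >= threshold]


# Hypothetical raw data; in a real analysis this would be read from a file.
raw = [
    {"sample": "A", "score": 0.95},
    {"sample": "B", "score": 0.42},
    {"sample": "C", "score": 0.91},
]

clean = filter_records(raw)
print([r["sample"] for r in clean])  # → ['A', 'C']
```

Rerunning the script on updated raw data reproduces the exact same cleanup, which no sequence of manual edits can guarantee.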

Rule 3: Archive the Exact Versions of All External Programs Used. Tedious, and perhaps overly meticulous, but sometimes fundamental.

Rule 4: Version Control All Custom Scripts. This is something that people tend to underestimate, but it is still very important. I actually need to improve on this part too; to get started, you can have a look here.

Rule 5: Record All Intermediate Results, When Possible in Standardized Formats. A fair tab-delimited file, or a CSV, is always an act of love towards your collaborators.
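For instance, dumping an intermediate result as plain CSV takes only a few lines and spares your collaborators from any tool-specific binary format. The column names and values below are made up for illustration (the sketch writes to an in-memory buffer; a real script would write to a file):

```python
import csv
import io

# Hypothetical intermediate result of an analysis step.
intermediate = [
    {"gene": "BRCA1", "reads": 1532},
    {"gene": "TP53", "reads": 987},
]

buffer = io.StringIO()  # stands in for open("counts.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=["gene", "reads"])
writer.writeheader()
writer.writerows(intermediate)

print(buffer.getvalue())
```

Anyone, with any tool, can open the result and follow the analysis from there.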

Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds. I have never had to use randomness myself, but recording the seeds makes stochastic analyses exactly repeatable.
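A toy sketch of why this works: if you fix and record the seed of the random number generator, a stochastic step (here, a made-up bootstrap resample) can be reproduced exactly by anyone who has the seed:

```python
import random

SEED = 42  # record this value alongside the results
rng = random.Random(SEED)

data = [4, 8, 15, 16, 23, 42]  # hypothetical observations
resample = [rng.choice(data) for _ in range(len(data))]
print(resample)

# Re-running with the same seed yields the identical resample.
rng_again = random.Random(SEED)
assert resample == [rng_again.choice(data) for _ in range(len(data))]
```

Without the seed, the resample differs on every run and the downstream numbers can never be checked exactly.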

Rule 7: Always Store Raw Data behind Plots. That shouldn’t even need to be mentioned.

Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected. Do not share only summarized data, but let your reviewers take stock of all the steps you took.

Rule 9: Connect Textual Statements to Underlying Results. Results and their interpretation must be clearly connected.

Rule 10: Provide Public Access to Scripts, Runs, and Results. In summary: keep your work transparent, and no one gets hurt.

When I was in high school, my Italian literature professors used to teach me that “a text is good if it is self-explanatory”. This means that readers must be able to understand your writing even if you are not there to explain it. This is more or less the simple principle one can adopt to improve the reproducibility of computational analyses.

An improvement in working habits can definitely help, even if it is not going to be enough. The role of journals, and the need to set up shared rules imposing greater transparency and reproducibility, is also widely discussed. Ultimately, as clearly pointed out in this article in Science, success in improving reproducibility will come from collaboration between scientists and journals: more sustainable working habits, and greater transparency in published results.