In a couple of months, the Beyond the PDF 2 meeting will take place in Amsterdam. The meeting is organized by the Force 11 group to promote research communication and “e-scholarship”. Basically, it aims to bring together scientists from different disciplines to foster discussion and (among other things) to promote enhancing, preserving and reusing the work published in scientific publications.
At the previous Beyond the PDF meeting (2011), Philip E. Bourne introduced the TB-Drugome, an experiment that had taken him and his team a couple of years to finish. The experiment took the ligand binding sites of all approved drugs in Europe and the USA and compared them against the ligand binding sites of the M.tb (Mycobacterium tuberculosis) proteins, in order to produce a set of candidate drugs that could (as a side effect of their original purpose) cure the disease.
Philip explained that all the results of the experiment were available online, and asked the computer scientists for the means to expose the method and the results appropriately so that they could be reused. His goal was that other people could apply this experiment to other diseases without spending much effort on changing the method followed for the TB-Drugome.
And that was precisely my objective during my first internship at ISI. I am not a domain expert in biology, but thanks to the help of the TB-Drugome authors, we finally reproduced the experiment as a workflow in the Wings platform. We also exported it as Linked Data, and abstracted the workflow so that any of its steps can be swapped for a different implementation. An example of a run can be seen here.
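To make the idea of an abstract workflow a bit more concrete, here is a minimal Python sketch. It is not how Wings represents workflows, and it has nothing to do with the actual TB-Drugome steps: the class `AbstractWorkflow`, the step names `normalize` and `rank`, and the toy implementations are all hypothetical, chosen only to illustrate that the workflow fixes the order of the steps while the implementation of each step is bound when the workflow is run.

```python
from typing import Callable, Dict, List

# A step implementation takes the previous step's output and returns its own.
StepImpl = Callable[[List[str]], List[str]]


class AbstractWorkflow:
    """An ordered list of named abstract steps (hypothetical, for illustration)."""

    def __init__(self, step_names: List[str]) -> None:
        self.step_names = list(step_names)

    def run(self, bindings: Dict[str, StepImpl], data: List[str]) -> List[str]:
        # Each abstract step must be bound to a concrete implementation.
        for name in self.step_names:
            if name not in bindings:
                raise KeyError(f"no implementation bound for step {name!r}")
            data = bindings[name](data)
        return data


# Toy implementations (assumptions, not real TB-Drugome components).
def normalize_lower(items: List[str]) -> List[str]:
    return [s.lower() for s in items]


def rank_alphabetically(items: List[str]) -> List[str]:
    return sorted(items)


def rank_by_length(items: List[str]) -> List[str]:
    return sorted(items, key=len)


workflow = AbstractWorkflow(["normalize", "rank"])

# Same abstract workflow, two different implementations of the "rank" step.
run_a = workflow.run(
    {"normalize": normalize_lower, "rank": rank_alphabetically}, ["B", "a"]
)
run_b = workflow.run(
    {"normalize": normalize_lower, "rank": rank_by_length}, ["ccc", "A", "bb"]
)
```

Swapping `rank_alphabetically` for `rank_by_length` changes the result without touching the workflow definition, which is the property that makes it possible to replace a tool when it is updated or becomes unavailable.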
As happens in other domains, workflows decay: the input databases change, the tools are updated or replaced, etc. I had to add small components to the workflow in order to make it work and preserve it. The results obtained were different, but consistent with the findings of the original experiment. Another interesting fact is that we measured the time we spent reproducing each of the steps. This gives an idea of how much effort a newcomer must put into reproducing a workflow, even when the authors are helping, and of how big this task can be. If I get a travel grant, I’ll share these results at the Beyond the PDF 2 meeting in Amsterdam :).