Some time ago, I wrote a short post on how to easily develop the documentation of an ontology when publishing it on the Web. Ontology documentation is critical for reuse, as it provides an overview of the terms of the ontology along with their definitions, examples and diagrams. Many researchers describe their ontologies in associated publications, but in my opinion good documentation is what any potential reuser will browse if they want to include the ontology in their work.
As I pointed out in my previous post, there are several tools that produce proper documentation, like LODE and Parrot. However, these tools focus only on the concepts of the ontology, and when using them I found myself facing three main limitations:
The tools are web services outside my control, and whenever the ontology exceeds a certain size, the web service will not accept it.
Exporting the generated ontology documentation is not straightforward: I have to download a huge HTML file and its dependencies from the browser.
If I want to edit the ontology documentation to add an introduction, diagrams, etc., I have to edit the huge downloaded HTML file. This is cumbersome, as finding the spot where I want to add new content is difficult. Editing the text is normally mandatory, as some of the metadata of the ontology is not annotated within the ontology itself.
To address these limitations, I created Widoco, a WIzard for DOCumenting Ontologies, more than a year ago. Widoco is based on LODE and helps you create the ontology documentation in three simple steps: introducing the ontology URI or file, completing its metadata and selecting the structure of the document you want to build. You can see a snapshot of the wizard below:
Originally, Widoco produced the documentation offline (no need to use external web services and no limit on the size of your ontology) and the output was divided into different documents, each of them containing one section. That way, each of them was more manageable to edit. The idea is similar to LaTeX projects, where you include the sections you want in the main document and comment out those you don’t want to include. Ideally, the document would readapt itself dynamically to show only the sections you want.
After some work, I have just released version 1.2.2 of the tool, and I would like to comment on some of its features here.
Metadata gathering improvements: Widoco tries to extract metadata from the ontology itself, but that metadata is often incomplete. Widoco now makes it possible to enter many metadata fields on the fly, if the user wants them to be added to the documentation. Some of the most recently added metadata fields indicate the status of the document and how to properly cite the ontology, including its DOI. In addition, it is possible to save and load the metadata as a .properties file, in case the documentation needs to be regenerated in the future. As for the license, if an internet connection is available, Widoco tries to retrieve the license name and metadata from the Licensius web services, which expose an endpoint of licenses.
Access to a particular ontology term: I have changed the anchors in the document to match the URI of the terms. Therefore, if a user dereferences a particular ontology term, they will be redirected to the definition of that term in the document. This is useful because it saves time when looking for the definition of a particular concept.
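To make the idea of anchors that match term URIs more concrete, here is a minimal sketch in Python. It is not Widoco's code: the function names and the example.org URIs are hypothetical, and I am assuming that the anchor is simply the local name of the term.

```python
from urllib.parse import urlsplit


def term_anchor(term_uri: str) -> str:
    """Return the local name of a term, to be used as an HTML anchor.

    Assumption for this sketch: the anchor is the fragment of a hash URI
    or the last path segment of a slash URI.
    """
    parts = urlsplit(term_uri)
    if parts.fragment:
        # Hash vocabulary: the fragment already is the local name.
        return parts.fragment
    # Slash vocabulary: take the last non-empty path segment.
    return parts.path.rstrip("/").rsplit("/", 1)[-1]


def documentation_url(doc_url: str, term_uri: str) -> str:
    """Build the URL of the term definition inside the documentation page."""
    return f"{doc_url}#{term_anchor(term_uri)}"
```

With this sketch, both `http://example.org/onto#Person` and `http://example.org/onto/Person` would point to the `Person` anchor of the documentation page.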
Automatic evaluation: If an internet connection is available, Widoco uses the OOPS! web service to detect common pitfalls in your ontology design. The report can be published along with the documentation.
Towards facilitating ontology publication and content negotiation: Widoco now produces a publishing bundle that you can copy and paste into your server. This bundle follows the W3C best practices, and adapts depending on whether your vocabulary uses hash or slash URIs.
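For readers who are not familiar with content negotiation, the following Python sketch shows the general idea: the server looks at the Accept header of the request and returns the serialization that best matches it. This is only an illustration, not what the Widoco bundle contains; the file names and the mapping are hypothetical, and the parsing ignores media ranges other than `*/*`.

```python
# Hypothetical mapping from media types to the files of a publishing bundle.
SERIALIZATIONS = {
    "text/html": "index.html",
    "application/rdf+xml": "ontology.xml",
    "text/turtle": "ontology.ttl",
    "application/ld+json": "ontology.json",
}


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media type, q) pairs, best first."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality value when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with the same q keep the header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(header: str, default: str = "text/html") -> str:
    """Return the file to serve for a given Accept header."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in SERIALIZATIONS:
            return SERIALIZATIONS[media_type]
        if media_type == "*/*":
            return SERIALIZATIONS[default]
    # Nothing acceptable was found: fall back to the HTML documentation.
    return SERIALIZATIONS[default]
```

In this sketch, a browser asking for `text/html` gets the documentation page, while a client asking for `text/turtle` gets the Turtle file.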
Multiple serialization: Widoco creates multiple serializations of your ontology and points to them from the ontology document. This helps any user to download their favorite serialization to work with.
Provenance and page markup: The main metadata of the ontology is annotated using RDFa, so that web search engines like Google can understand and point to the contents of the ontology easily. In addition, an HTML page is created with the main provenance statements of the ontology, described using the W3C PROV standard.
Multilingual publishing: Ontologies may be described in multiple languages, and I have enabled Widoco to generate the documentation in a multilingual way, linking to other languages on each page. That way you avoid having to run the program several times for generating the documentation in different languages.
Multiple styles for your documentation: I have now enabled two different styles for publishing vocabularies, and I am planning to adapt the new ReSpec style from the W3C.
Dynamic sections: For each section added to the document, the user will not have to worry about its numbering, as it will be done automatically. In addition, the table of contents will change according to the sections the user wants to include in the final document.
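The behavior described for dynamic sections can be sketched in a few lines of Python. This is just an illustration of the idea, not the way Widoco implements it, and the section titles in the usage note are made up.

```python
def number_sections(sections: list[tuple[str, bool]]) -> list[str]:
    """Build the table of contents from (title, included) pairs.

    Only the included sections are numbered, so excluding a section
    never leaves a gap in the numbering.
    """
    toc = []
    for title, included in sections:
        if included:
            toc.append(f"{len(toc) + 1}. {title}")
    return toc
```

For example, if the sections are Introduction, Overview, Description and References, and Overview is excluded, the table of contents becomes "1. Introduction", "2. Description", "3. References".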
Due to the number of requests, I also created a console version of Widoco, with plenty of options to run all the possible combinations of the features listed above. Even though you don’t need an internet connection, you may want one for accessing the Licensius and OOPS! web services. Both the console version and the desktop application are included in the same JAR, available on GitHub: https://github.com/dgarijo/Widoco/releases/tag/v1.2.2
I built this tool to make my life easier, but it turns out that it can make the life of other people easier too. Do you want to use Widoco? Check out the latest release on GitHub. If you have any problems, open an issue! Some new features (like an automated changelog) will be included in the next releases.
In the last week of January I was invited to a Dagstuhl seminar about reproducibility in e-Science, and I think it would be helpful to summarize and highlight some of the results in this blog post. A more detailed report will be published in the next few months, so take this as a sneak peek. If you want to reference any of the figures or tables in the summary, please cite the Dagstuhl report.
So… what are Dagstuhl seminars?
They consist of one-week meetings that bring together researchers of a community to discuss a certain topic. The seminars are held at Schloss Dagstuhl, the Leibniz Center for Informatics, near Wadern, a location far from any big city. Basically, the purpose of these seminars is to isolate the participants from the world in order to push forward discussions about the topic at hand.
The seminar was organized by Andreas Rauber, Norbert Fuhr and Juliana Freire, and I think they did a great job bringing together people from different areas: information retrieval, psychology, bioinformatics, etc. It would have been great to see more people from libraries (who have been in charge of preserving knowledge for centuries), publishers and funding agencies, as in my opinion they are the ones who can really push reproducibility forward by making authors comply with reproducibility guidelines/manifestos. Maybe we can use the outcomes of this seminar to convince them to join us next time.
To be honest, I was a bit afraid that this effort would result in just another manifesto or set of guidelines for enabling reproducibility. Some of the attendees shared the same feeling, and therefore one of the first items on the agenda consisted of summaries of other reproducibility workshops that participants had attended, like the Euro RV3 workshop or the Artifact Evaluation for Publication workshop (also held at Dagstuhl!). This helped shape the agenda a little and move forward.
Tools, state of the art and war stories:
Discussion is the main purpose of the Dagstuhl seminar, but the organizers scheduled a couple of sessions for each participant to introduce what they had been doing to promote reproducibility. This included specific tools for enabling reproducibility (e.g., noWorkflow, ReproZip, yesWorkflow, ROHub, etc.), updates on the state of the art of a particular area (e.g., the work done by the Research Data Alliance, music, earth sciences, bioinformatics, visualization, etc.) and war stories of participants who had attempted to reproduce other people’s work. In general, the presentations I enjoyed the most were the war stories. At the beginning of my PhD I had to reproduce an experiment from a paper, so I know it can involve some frustration and a lot of work. I was amazed by the work done by Martin Potthast (see paper) and Christian Collberg (see paper) to empirically reproduce the work of others. In particular, Christian maintains a list of the papers he and his group have been able to reproduce. Check it out here.
Measuring the information gain
What do we gain by making an experiment reproducible? In an attempt to address this question, we identified the main elements into which a scientific experiment can be decomposed. Then, we analyzed what would happen if each of these components changed, and how each of these changes relates to reproducibility.
The atomic elements of an experiment are the goals of the experiment, the abstract methods (algorithms, steps) used to achieve the goals, the particular method used to implement the abstract algorithm or sketch, the execution environment or infrastructure used to execute the experiment, the input data and parameter values and the scientists involved in the experiment execution. An example is given below:
(R) Research Objectives / Goals: Reorder stars by their size.
(M) Methods / Algorithms: Quicksort.
(I) Implementation / Code / Source-Code: Quicksort in Java.
(D) Data (input data and parameter values): The dataset X from the Virtual Observatory catalog.
(A) Actors / Persons: Daniel, who designs and executes the experiment.
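As a rough illustration (not taken from the report), the decomposition above can be written down as a small record in Python, which makes it easy to state which components differ between two runs of an experiment. The field names, the `platform` value and the second dataset are assumptions made for this sketch.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Experiment:
    research_objective: str  # (R)
    method: str              # (M)
    implementation: str      # (I)
    platform: str            # execution environment or infrastructure
    data: str                # (D)
    actor: str               # (A)


def changed_components(original: Experiment, rerun: Experiment) -> list[str]:
    """List the names of the components that differ between two runs."""
    return [
        f.name
        for f in fields(Experiment)
        if getattr(original, f.name) != getattr(rerun, f.name)
    ]


original = Experiment(
    research_objective="Reorder stars by their size",
    method="Quicksort",
    implementation="Quicksort in Java",
    platform="Laptop with a JVM",  # hypothetical value, not in the example above
    data="Dataset X from the Virtual Observatory catalog",
    actor="Daniel",
)

# Same experiment, but run with a different (hypothetical) input dataset.
rerun = replace(original, data="Dataset Y (hypothetical)")
```

Here `changed_components(original, rerun)` reports that only the data changed, which is the kind of observation the discussion below builds on.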
Changing or preserving each of these elements of the experiment may change the obtained results. For example, if we change the input data but keep the rest of the parts the same, we test the robustness of the experiment (new data may identify new corner cases that were not considered before). If we change the platform successfully but preserve the rest, then we improve the portability of the experiment. In the following table you can see a summary of the overall discussion. Due to time constraints we didn’t alter all the possible columns to represent all possible scenarios, but we represented the ones that are more likely to happen:
There are three main types of actions that you can take in order to improve the reproducibility of your work. These are proactive actions (e.g., data sharing, workflow sharing, metadata documentation, etc.), reactive actions (e.g., a systematic peer review of the components of your experiment, reimplementation studies, etc.) and supportive actions (e.g., corpus construction for reproducibility, libraries of tools that help reproducibility, etc.). These actions affect three different categories: those which involve paper reproducibility (i.e., individual papers), those which involve improving the reproducibility of groups of papers affecting a particular area of interest (like health studies that recommend a solution for a particular problem) and those which involve the creation of benchmarks that ensure that a proposed method can be executed with other state of the art data.
The following figure (extracted from the report draft) summarizes the taxonomy discussion:
Actors in reproducibility and guidelines for achieving reproducibility.
Another activity I think is worth mentioning in this summary is the analysis that part of the group did of the different types of actors who participate in one way or another in reproducibility, along with the obstacles these actors may find in their path.
There are 6 types of actors in reproducibility: those that create contents (authors, lab directors, research software engineers, etc.), those that consume the contents (readers, users, authors, students, etc.), those that moderate the contents (editors), those who examine the contents (reviewers, examiners, etc.), those who enable the creation of the contents (funders, lab directors, etc.) and those who audit the contents (policy makers, funders).
For each of the actors, the group discussed checklists that guide them on how to fully achieve the reproducibility of their contents at three different levels: sufficient (i.e., minimum expectation of the actor regarding the demands for reproducibility), better (an additional set of demands which improve the previous ones) and exemplary (i.e., best practices). An example of these checklists for authors can be seen below (extracted from the report):
Methods section – to a level that allows imitation of the work
Appropriate comparison to appropriate benchmark
Data accurately described
Can re-run the experiment
Verify on demand (provide evidence that the work was done as described)
Ethical considerations noted, clearances listed
Conflicts noted, contributions and responsibilities noted
Use of other authors’ reproducibility materials should respect the original work and reflect an attempt to get best-possible results from those materials
Code is made available, in the form used for the experiments
Accessible or providable data
Engineered for re-use
Published in trustworthy, enduring repository
Data recipes, to allow construction of similar data
Data properly annotated and curated
Executable version of the paper; one-click installation and execution
Making a reproducibility paper publishable
Another cool effort aimed to determine whether reproducibility is a means or an end for a publication. Hence, the group discussed whether an effort to reproduce an actual research paper would be publishable or not, depending on the available resources and the obtained outcome. Generally, when someone intends to reproduce existing work, it is because they want to repurpose it or reuse it in their experiments. But that objective may be affected, for example, if the code implementing the method to be reproduced is no longer available. The discussion led to the following diagram, which covers a set of possible scenarios:
In the figure, the red crosses indicate that the effort would not have much value as a new publication. The pluses indicate the opposite, and the number of pluses would affect the target of the publication (one plus would be a workshop, while four pluses would be a top journal/conference publication). I find the diagram particularly interesting, as it introduces another benefit of trying to reproduce someone else’s experiments.
Incentives and barriers, or investments and returns?
Incentives are often the main reason why people adopt best practices and guidelines. The problem is that, in the case of reproducibility, each incentive also has an associated cost (e.g., making all the resources available under an open license). If the cost is excessive for its return, then some people might just not consider it worth it.
One of the discussion groups aimed to address this question by categorizing the costs/investments (e.g., artifact preparation, documentation, infrastructure, training, etc.) and returns/benefits (publicity, knowledge transfer, personal satisfaction, etc.) for the different actors identified above (funders, authors, reviewers, etc.). The tables are perhaps too big to include here (you can have a look once we publish the final report), but in my opinion the important message to take home is that we have to be aware of both the cost of reproducibility and its advantages. I have personally experienced how frustrating it is to document in detail the inputs, methods and outputs used in a Research Object that expands on a paper that has already been accepted. But then, I have also seen the benefits of my efforts when I wanted to rerun the evaluations several months later, after I had made additional improvements.
Defining a Research Agenda: Current challenges in reproducibility
Do you want to start a research topic about reproducibility? Here are a few challenges that may give you ideas for contributing to the state of the art:
What are the interventions needed to change the behavior of researchers?
Do reproducibility and replicability translate into long-term impact for your work?
How do we set the research environment for enabling reproducibility?
Can we measure the cost of reproducibility/repeatability/documentation? What are the difficulties for newcomers?
In conclusion, I think the seminar was a positive experience. I learnt, met new people and discussed a topic that is very close to my research area with experts in the field. I think a couple of things could be improved, like having better synchronization with other reproducibility efforts taking place in Dagstuhl or having more representation from publishers and funding agencies, but I think the organizers will take it into account for future meetings.
Special thanks to Andy, Norbert and Juliana for making the seminar happen. I hope everyone enjoyed it as much as I did. If you want to know more about the seminar and some of its outcomes, have a look at the report!
Apparently September was the month of library conferences. First, the DC-Ipres conference took place during the first week of the month, while the Theory and Practice of Digital Libraries (TPDL) conference was held from the 22nd to the 26th in Malta. I have recently realized that I forgot to add the summary of TPDL, so my highlights can be found below.
In general, the impression I got is that, despite its name, TPDL is a very technology-oriented event. Linked Data was a hot topic, but user interfaces, mining algorithms, classification, preservation and visualization approaches were also discussed for the library domain. Another curious fact is that many of the talks and papers were related to Europeana project data or models. I had no idea of the size of the project, which is leading to many contributions from a huge number of institutions all over Europe.
Since there were many parallel sessions, my highlights won’t cover everything. If you want more information you can see the whole program here.
The COST actions for Digital Libraries, which serve to create networks of researchers all over the world.
An interesting map-based visualization using hierarchies and Europeana data with a layered approach (see more here).
The project presented in the session “Using Requirements in Audio Visual Research, a quantitative approach”, which will link together fragments of videos (from a repository of more than 800k hours) and annotate them. I asked the person responsible whether the data was going to be made available, but for the moment it doesn’t look like it. Very cool ideas though, and very useful for journalists and regular users.
The semantic hierarchical structuring of cultural heritage objects done with Europeana data to put together resources that refer to the same “thing” using metadata (for example, to detect duplicates and several different views (pictures) of the same object). Very useful to curate the data, but it lacked a comparison with other clustering methods, which should be done in the future.
The keynote by Sören Auer, where he presented several of the Linked Data aware applications that he and his group had been developing and how they could help librarians in different ways. Ontowiki was the most complete one, a semantic wiki for creating portals and annotating them according to the Linked Data principles (including content negotiation for each of its pages).
The “resurrecting myRevolution paper”, regarding the tweets and links that go missing in the web and how to archive and preserve them properly. This presentation in particular focused on tweets that referenced images that don’t exist anymore (e.g., those taken during the green revolution in Iran).
A nice motivational presentation by Sarah Callaghan on data citation, why we need it and why we should have it. More details here.
The Investigation Research Objects being created in the SCAPE project, based on the foundations laid by wf4Ever and combining them with persistent identifiers like DOIs.
Finally, I wouldn’t like to finish without mentioning that the organizers were given the title of Knights of the Digital Libraries, which was very well received by everyone at the conference. Below you can see some of the ceremony, along with a picture of Malta’s National Library.
In a couple of months from now, the Beyond the PDF 2 meeting will take place in Amsterdam. This meeting is organized by the Force 11 group in order to promote research communications and “e-scholarship”. Basically, it aims to bring together scientists from different disciplines in order to create discussion and (among other things) promote enhancing, preserving and reusing the work published as scientific publications.
In the previous Beyond the PDF meeting (2011), Philip E. Bourne introduced the TB-Drugome, an experiment that had taken him and his team a couple of years to finish. The experiment took the ligand binding sites of all approved drugs in Europe and USA and compared them against the ligand binding sites of the M.tb (Tuberculosis) proteins in order to produce a set of candidate drugs that could (as a side effect of their original purpose) cure the disease.
Philip explained that all the results of the experiment were available online, and asked the computer scientists for the means to expose the method and the results appropriately in order to be reused. His purpose was that other people could use this experiment for dealing with other diseases without spending much effort in changing the method they had followed for the TB Drugome.
And that was precisely my objective during my first internship at ISI. I was not a domain expert in biology, but thanks to the help of the TB-Drugome authors, we finally reproduced the experiment as a workflow in the Wings platform. We also exported it as Linked Data, and abstracted the workflow so that any of its steps could be carried out with different implementations. An example of a run can be seen here.
As happens in other domains, workflows decay: the input databases change, the tools are updated/changed, etc. I had to add small components to the workflow in order to make it work and preserve it. The results obtained were different, but consistent with the findings of the original experiment. Another interesting fact is that we quantified the time it took us to reproduce each of the steps. This quantification gives an idea of how much effort a newcomer must put into reproducing a workflow when the authors are helping, just to give an insight into how big this task can be. If I get a travel grant, I’ll share these results in the Beyond the PDF 2 meeting in Amsterdam :).