Towards a human readable maintainable ontology documentation
Posted by dgarijov on August 29, 2016
Some time ago, I wrote a small post to guide people on how to easily develop the documentation of your ontology when publishing it on the Web. The ontology documentation is critical for reuse, as it provides an overview of the terms of the ontology with examples, diagrams and their definitions. Many researchers describe their ontologies in associated publications, but in my opinion a good documentation is what any potential reuser will browse if they want to include the ontology on their work.
As I pointed out in my previous post, there are several tools to produce a proper documentation, like LODE and Parrot. However, these tools focus just in the concepts of the ontology, and when using them I found myself facing three main limitations:
- That the tools are in web services external to my control, and whenever the ontology is larger than a certain size, the web service will not admit it.
- That whenever I want to export the produced ontology documentation, it’s not straightforward: I have to download a huge html and it dependencies from the browser.
- That if I want to edit the ontology documentation adding an introduction, diagrams, etc., I have to edit the huge downloaded html. This is cumbersome, as finding the spot where I want to add new contributions is difficult. Normally the edition of the text is mandatory, as some of the metadata of the ontology is not annotated within the ontology itself.
In order to face these limitations, I decided to create Widoco, a WIzard for DOCumenting Ontologies, more than a year ago. Widoco is based on LODE and helps you creating the ontology in three simple steps: introducing the ontology URI or file, completing its metadata and selecting the structure of the document you want to build. You can see a snapshot of the wizard below:
Originally, Widoco produced the documentation offline (no need to use external web services and without a limit for the size of your ontology) and the output was divided in different documents, each of them containing a new section. That way, it was more manageable to edit each of them. The idea here is to be similar to Latex projects, where you include the sections you desire on the main document and comment those you don’t want to include. Ideally, the document would readapt itself to show only the sections you want, dynamically.
After some work, I have just released the version 1.2.2 of the tool, and I would like to comment some of its features here.
- Metadata gathering improvements: Widoco will aim to extract metadata from the ontology itself, but that metadata is often incomplete. With Widoco now it is possible to introduce many metadata fields on the fly, if the user wants them to be added to the documentation. Some of the latest added metadata fields indicate the status of the document and how to properly cite the ontology, including its DOI. In addition, it is possible to save and load the metadata properties as a .properties file, in case the documentation needs to be regenerated in the future. As for the license, if an internet connection is available, Widoco will aim to retrieve the license name and metadata from the Licensius web services, where an endpoint of licenses is ready for exploitation.
- Access to a particular ontology term: I have changed the anchors in the document to match the URI of the terms. Therefore, if a user derreferences a particular ontology term, he/she will be redirected to the particular definition of that term in the document. This is useful because it saves time when looking for the definition of a particular concept.
- Automatic evaluation: If an internet connection is available, Widoco uses the OOPS! web service to detect common pitfalls in your ontology design. The report can be published along with the documentation.
- Towards facilitating ontology publication and content negotiation: Widoco now produces a publishing bundle that you can copy and paste in your server. This bundle is published according to the W3C best practices, and adapts depending on whether your vocabulary is hash or slash.
- Multiple serialization: Widoco creates multiple serializations of your ontology and points to them from the ontology document. This helps any user to download their favorite serialization to work with.
- Provenance and page markup: The main metadata of the ontology is annotated using RDF-a, so the web searchers like Google can understand and point to the contents of the ontology easily. In addition, an html page is created with the main provenance statements of the ontology, described using the W3C PROV standard.
- Multilingual publishing: Ontologies may be described in multiple languages, and I have enabled Widoco to generate the documentation in a multilingual way, linking to other languages on each page. That way you avoid having to run the program several times for generating the documentation in different languages.
- Multiple styles for your documentation: now I have enabled two different styles for publishing the vocabularies, although I am planning to adapt the new respec style from the W3C.
- Dynamic sections: For each section added in the document, the user will not have to worry about their numbering, as it will be done automatically. In addition, the table of contents will change accordingly to the sections the user wants to include in the final document.
Due to the amount of requests, I also created a console version of Widoco, with plenty of options to be able to run all the possible combinations of the features listed above. Even though you don’t need internet connection, you may want it for accessing Licensius and OOPS! webservices. Both the console version and desktop application are available through the same JAR, accessible in the Github: https://github.com/dgarijo/Widoco/releases/tag/v1.2.2
I built this tool to make my life easier, but it turns out that it can be used to make the life of other people easier too. Do you want to use Widoco? Check out the latest release on Github. If you have any problems open an issue! Some new features (like an automated changelog) will be included in the next releases.