Tag: OWL

Permanent identifiers and vocabulary publication: purl.org and w3id

Some time ago, I wrote a tutorial on common practices for publishing vocabularies/ontologies on the Web. In particular, the second step of the tutorial described how to set a stable URI for your vocabulary. The tutorial referred to purl.org, a popular service for creating permanent URLs on the web. Purl.org had been working for more than 15 years and was widely used by the community.

However, several months ago purl.org stopped registering new users. Then, only a couple of months ago, the website stopped letting users register or edit their permanent URLs. The official response is that there is a problem with the SOLR index, but I am afraid that the service is not reliable anymore. The current purl redirects work properly, but I don’t know whether they intend to keep maintaining it in the future. It’s a bit sad, because it was a great infrastructure and service to the community.

Fortunately, other permanent identifier efforts have been hatched successfully by the community. In this post I am going to talk a little about w3id.org, an effort launched by the W3C permanent identifier community group that has been adopted by a great part of the community (with more than 10K registered ids). W3id is supported by several companies, and although there is no official commitment from the W3C for maintenance, I think it is currently one of the best options for publishing resources with a permanent id on the web.

Differences with purl.org: w3id is a bit geekier, but way more flexible and powerful when doing content negotiation. In fact, you don’t need to talk to your admin to do the content negotiation because you can do it yourself! Apart from that, the main difference between purl.org and w3id is that you don’t have a user interface to edit your purls. Instead, you edit the .htaccess files in the w3id repository on Github.

How to use it: let’s imagine that I want to create a vocabulary for my domain. In my example, I will use the coil ontology, an extension of the videogame ontology for modeling a particular game. I have already created the ontology, and assigned it the URI: https://w3id.org/games/spec/coil#. I have produced the documentation and saved the ontology file in both rdf/xml and TTL formats. In this particular case, I have chosen to store everything in one of my repositories in Github: https://github.com/dgarijo/VideoGameOntology/tree/master/GameExtensions/CoilOntology. So, how to set up the w3id for it?

1) Go to the w3id repository and fork it. If you don’t have a Github account, you must create one before forking the repository.
2) Create the folder structure you assigned in the URI of your ontology (I assume that you won’t be rewriting somebody else’s URI; if that is the case, the admins will likely detect it). In my example, I created the folders “games/spec/” (see in repo).
3) Create the .htaccess. In my case it can be seen in the following url: https://github.com/perma-id/w3id.org/blob/master/games/spec/.htaccess. Note that I have included negotiation for three vocabularies in there.
4) Commit your changes and push them to your fork.
5) Create a pull request to the perma-id repository.
6) Wait until the admins accept your changes.
7) You are done! If you want to add more w3id ids, just push them to your fork and create additional pull requests.
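
To give an idea of what the .htaccess in step 3 might contain, here is a minimal sketch wrapped in a small Python helper that fills in the three target URLs. The rules are my assumption of a typical content negotiation setup with Apache mod_rewrite (including the choice of 303 redirects); they are not a copy of the file in the perma-id repository, and the names `HTACCESS_TEMPLATE` and `render_htaccess` are made up for this illustration.

```python
# Sketch only: an assumed content negotiation setup for one vocabulary,
# not the actual .htaccess stored in the perma-id repository.
HTACCESS_TEMPLATE = """\
Options +FollowSymLinks
RewriteEngine on

# Clients asking for HTML (browsers) are sent to the documentation
RewriteCond %{{HTTP_ACCEPT}} text/html [OR]
RewriteCond %{{HTTP_ACCEPT}} application/xhtml\\+xml
RewriteRule ^{name}$ {html} [R=303,L]

# Clients asking for Turtle are sent to the TTL file
RewriteCond %{{HTTP_ACCEPT}} text/turtle
RewriteRule ^{name}$ {ttl} [R=303,L]

# Everything else (including application/rdf+xml) gets the RDF/XML file
RewriteRule ^{name}$ {rdfxml} [R=303,L]
"""


def render_htaccess(name, html, ttl, rdfxml):
    """Fill the template for one vocabulary name (assumed to be a plain word)."""
    return HTACCESS_TEMPLATE.format(name=name, html=html, ttl=ttl, rdfxml=rdfxml)


# Target URLs taken from the coil example in this post
BASE = "http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/"
coil_rules = render_htaccess(
    "coil", BASE + "coilDoc/", BASE + "coil.ttl", BASE + "coil.owl"
)
print(coil_rules)
```

The printed text would go into games/spec/.htaccess in your fork. The order of the rules matters: the first matching rule wins, so the catch-all RDF/XML rule has to come last.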

Now every time somebody accesses the URL https://w3id.org/games/spec/coil#, the request will be redirected to wherever the .htaccess file points. In my case, http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/coilDoc/ for the documentation, http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/coil.ttl for TTL and http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/coil.owl for rdf/xml. The same approach also works for simple 302 redirections. W3id administrators are usually very fast to review and accept the changes (so far I haven’t had to wait more than a couple of hours before having everything reviewed). The whole process is perhaps slower than purl.org used to be, but I really like the approach. And you can do negotiations that you were unable to achieve with purl.org.
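
If you want to check the negotiation yourself once the pull request is merged, something along these lines could help. It is a rough sketch using only the Python standard library; `build_request` and `resolve` are hypothetical helper names, and the URI is written without the trailing “#” because the fragment is never sent to the server.

```python
import urllib.request

# The coil ontology URI from this post, without the fragment
ONTOLOGY_URI = "https://w3id.org/games/spec/coil"


def build_request(uri, mime_type):
    """Prepare (but do not send) a GET request asking for one representation."""
    return urllib.request.Request(uri, headers={"Accept": mime_type})


def resolve(uri, mime_type):
    """Follow the redirects and return the final URL (needs network access)."""
    request = build_request(uri, mime_type)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.geturl()


# Example usage (uncomment to run against the live service):
# for mime in ("text/html", "text/turtle", "application/rdf+xml"):
#     print(mime, "->", resolve(ONTOLOGY_URI, mime))
```

If the redirects are in place, each media type should end up at a different target: the documentation, the TTL file or the rdf/xml file.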

Http vs https: As a final comment, w3id uses https. If you publish something with http, it will be redirected to https. This may look like an unimportant detail, but it is critical in some cases. For example, I have found that some applications cannot negotiate properly if they have to handle a redirect from http to https. An example is Protégé: if you try to load http://w3id.org/games/spec/coil#, the program will raise an error. Using https in your URI works fine with the latest version of the program (Protégé 5).
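
If you keep lists of ontology URIs around (for example in scripts that load them into a tool), one possible workaround is to rewrite the w3id ones to https before using them. The helper below is just an illustration; `force_https` is a name I made up, and it deliberately leaves non-w3id URIs untouched.

```python
def force_https(uri):
    """Rewrite http://w3id.org/... to https://w3id.org/...; leave other URIs alone."""
    prefix = "http://w3id.org/"
    if uri.startswith(prefix):
        return "https://w3id.org/" + uri[len(prefix):]
    return uri


print(force_https("http://w3id.org/games/spec/coil#"))
print(force_https("http://purl.org/net/wf-motifs"))
```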

How to (properly) publish a vocabulary or ontology in the web (part 5 of 6)

This week I want to quickly introduce how and why you should include a license in your vocabulary and documentation. Since this subject has been already dealt with, I am mainly going to be providing links to posts describing these matters in detail.

Why should you add a license to your ontologies? Because if others want to reuse your vocabulary or ontology, the license will clarify what they are allowed to do with it according to the law (for instance, whether they have to give attribution to your work). Remember that you are the intellectual author and you have the rights over the resource being published. See more details and types of licenses here.

How can you specify a license? You can add it as a semantic description to the ontology/vocabulary. Two widely used properties are dc:rights and dcterms:license, from the Dublin Core vocabularies (the license property is defined in the Dublin Core terms namespace, http://purl.org/dc/terms/). These properties can be used to describe the OWL file being produced, or in the documentation itself with annotations in RDFa or microdata. See how it can be done here.
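
As a rough illustration of the first option, the snippet below builds a minimal rdf/xml header with a license annotation using only the Python standard library. It is a sketch under some assumptions: the license property is taken from the Dublin Core terms namespace (written dcterms:license here), the ontology URI is the wf-motifs one used elsewhere in this tutorial, and `ontology_header` is a hypothetical helper name. In practice you would normally add the annotation from your ontology editor.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
DCTERMS = "http://purl.org/dc/terms/"

# Use readable prefixes instead of the auto-generated ns0, ns1, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("owl", OWL)
ET.register_namespace("dcterms", DCTERMS)


def ontology_header(ontology_uri, license_uri):
    """Return rdf/xml declaring an owl:Ontology with a dcterms:license."""
    root = ET.Element(f"{{{RDF}}}RDF")
    ontology = ET.SubElement(
        root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": ontology_uri}
    )
    ET.SubElement(
        ontology, f"{{{DCTERMS}}}license", {f"{{{RDF}}}resource": license_uri}
    )
    return ET.tostring(root, encoding="unicode")


header_xml = ontology_header(
    "http://purl.org/net/wf-motifs",
    "http://creativecommons.org/licenses/by-nc-sa/2.0/",
)
print(header_xml)
```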

Spend some time analyzing which license is the most appropriate for your work. It may help you and many others in the future! If you are unsure which license to use, this is the one we use on our vocabularies: http://creativecommons.org/licenses/by-nc-sa/2.0/.

This is part of a tutorial divided in 7 parts:

Overview of the tutorial.
(Reqs addressed A1(partially), A2, A3, A4, P1) Publishing your vocabulary at a stable URI using RDFS/OWL.
(Reqs addressed P2, P3). How to design a human readable documentation.
Extra: A tool for creating html readable documentation
(Reqs addressed P4). Dereferencing your vocabulary
(Reqs addressed A1 (partially)). Dealing with the license. (this post)
(Reqs addressed A5, P5). Reusing other vocabularies. (To appear)

How to (properly) publish a vocabulary or ontology in the web (part 2 of 6)

This part of the tutorial explains how to publish your vocabulary at a stable URI using RDFS/OWL. In order to make things easier, I’ll illustrate each step of this part of the tutorial with an example. The steps to follow are further described below:

1) Select the name of your vocabulary/ontology. Easy, right? In my case I want to publish an ontology encoding the workflow motif catalogue we describe in this paper, so the name I have chosen is “The workflow motif ontology” (wf-motifs to keep it short).

2) Select the proper URI to publish your vocabulary. Now that we know how we want to name our vocabulary, things start to get trickier. Which URI do we choose? How do we ensure that it is not going to change?
The URI you choose for your ontology should be permanent and defined in a domain you control. The rationale behind this is simple: imagine that somebody is reusing the concepts defined in your ontology and you change its URI. The person reusing your ontology will no longer know the proper definitions and semantics of the reused term.
Since I assume that most of the people reading this are not willing to pay for a new domain each time a new ontology is published, I recommend defining the URI of your vocabularies/ontologies in http://purl.org. PURL stands for “persistent uniform resource locator”, and they are widely used to give persistent URIs to resources. Once you register in the page, the process is really simple. You define a new domain, wait for the approval and create the URI for your ontology. In my case it is: http://purl.org/net/wf-motifs.

EDIT: If you are interested in having more control on your redirections, w3id is a better alternative to purl. Have a look at my post for more information on how to set it up.

Note 1: If you create the name under the /net/ domain things will go faster, since it is the default domain. Otherwise they’ll have to approve the domain AND the name of your vocabulary/ontology.

Note 2: Someone could argue that by speaking to the system admin of your enterprise/university you can obtain the vocabulary URI as well. However, depending on who you are and the ontology you are working on, the URI they suggest could be something like: mayor2.dia.fi.upm.es/oeg-upm/files/dgarijo/wf-motifs. This is perfectly fine, but this looks more like the place where my .owl will finally be stored. If my file has to be moved, my URI will change. Using purl ensures the URI will be permanent, and that I have control over it.

3) Create the ontology in RDF/OWL: There are several editors to create vocabularies/ontologies and their properties according to the W3C standards: Protégé, the NeOn Toolkit, TopBraid Composer, etc. The one I’m most familiar with is Protégé, which is free to install and use (they say that TopBraid is very good, but since the license is quite expensive I haven’t been able to test it). Once you have installed your editor you just have to replace the base URI of the ontology (Ontology IRI in Protégé) with the one you registered as a PURL. Protégé will use a hash (“#”) by default to identify the classes and properties you declare in the vocabulary/ontology. You can use a slash (“/”) for this purpose as well.

Hash versus slash debate: There has been a long discussion regarding the usage of “/” vs “#”. If you are not sure about which one is the best for your vocabulary/ontology, here is a tip: if your ontology will be huge and will be divided in many different modules, use “/”. Otherwise use “#”. It is easier to set up and will make it easier to point to the right spot in the documentation.

Returning to the example, this is what my ontology IRI looks like:
http://purl.org/net/wf-motifs#
and a sample class will be
http://purl.org/net/wf-motifs#Motif
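
One practical consequence of the hash vs slash choice can be seen with a couple of lines of Python. With “#”, a client strips the fragment before making the request, so every term resolves to the same ontology document; with “/”, each term is its own URL and needs its own redirection. The slash variant below is hypothetical (my ontology uses the hash), and `document_for` is a made-up helper name.

```python
from urllib.parse import urldefrag


def document_for(term_iri):
    """Return the URL an HTTP client actually requests for a term IRI."""
    return urldefrag(term_iri).url


hash_term = "http://purl.org/net/wf-motifs#Motif"
slash_term = "http://purl.org/net/wf-motifs/Motif"  # hypothetical slash variant

print(document_for(hash_term))   # the ontology document itself
print(document_for(slash_term))  # a separate URL for the term
```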

4) Redirect your permanent URI to your vocabulary/ontology file. Once you are done editing your vocabulary/ontology, you have to host the .owl file somewhere. It is not important where you host it, as long as you know that it won’t be deleted. It’s fine if it gets moved, as long as you know where. In my case, I talked to the system admin and he stored the owl file here:
http://vocab.linkeddata.es/motifs/motif-ontology1.1.owl
Finally, we go back to the purl page and add a basic redirection to the target URL we have just set up.

Now whenever we enter the URI of our ontology, it will be redirected to the OWL file. Congrats!
Note: In my case http://purl.org/net/wf-motifs will take you to the ontology if you load it in Protégé and to the documentation if you load it from the web browser. I’ll explain how to achieve that in part 4 of the tutorial, so don’t worry for the moment.

Note: the steps I propose here are not normative. There may be other ways to achieve what is covered here. This is just a possible way to do it.

This is part of a tutorial divided in 7 parts:

Overview of the tutorial.
(Reqs addressed A1(partially), A2, A3, A4, P1) Publishing your vocabulary at a stable URI using RDFS/OWL. (this post)
(Reqs addressed P2, P3). How to design a human readable documentation.
Extra: A tool for creating html readable documentation.
(Reqs addressed P4). Dereferencing your vocabulary.
(Reqs addressed A1 (partially)). Dealing with the license.
(Reqs addressed A5, P5). Reusing other vocabularies.

How to (properly) publish a vocabulary or ontology in the web (1 of 6)

Vocabularies and ontologies have been developed in recent years for modeling different use cases in heterogeneous domains. These vocabularies/ontologies are often described in journal publications and conferences, which reflect the rationale of the design decisions taken during their development. Now that everyone is talking about Linked Data, I have found myself looking for guidelines for properly publishing my vocabularies on the web, but unfortunately the required documentation is scattered across many different places.

First things first. What do I mean by properly publishing a vocabulary? By that I refer to making it an accessible resource, both human and machine readable, with documentation, examples and its license specified. In this regard, there have been two initiatives for gathering the requirements I am trying to address in this tutorial: the 5-Star Vocabulary requirements by Bernard Vatant and the AMOR manifesto by Raúl García-Castro. Both of these approaches are based on Tim Berners-Lee’s Linked Data 5 star rating, and complement each other. In this tutorial (which will be divided in 5 parts), I will cover possible solutions to address each of their requirements, further described below (quoting the original posts).

– Requirements of the AMOR manifesto (A):

(A1) The ontology is available on the web (whatever format) but with an open licence
(A2) All the above, plus: available as machine-readable structured data (e.g., CycL instead of image scan of a table)
(A3) All the above, plus: non-proprietary format (e.g., OBO instead of CycL)
(A4) All the above, plus: use open standards from the W3C (RDF Schema and OWL)
(A5) All the above, plus: reuse other people’s ontologies in your ontology

– Requirements of the 5 star vocabulary principles (P):

(P1) Publish your vocabulary on the Web at a stable URI
(P2) Provide human-readable documentation and basic metadata such as creator, publisher, date of creation, last modification, version number
(P3) Provide labels and descriptions, if possible in several languages, to make your vocabulary usable in multiple linguistic scopes
(P4) Make your vocabulary available via its namespace URI, both as a formal file and human-readable documentation, using content negotiation
(P5) Link to other vocabularies by re-using elements rather than re-inventing.

The tutorial will be divided in 5 parts (plus this overview):

Overview of the tutorial. (this post)
(Reqs addressed A1(partially), A2, A3, A4, P1) Publishing your vocabulary at a stable URI using RDFS/OWL.
(Reqs addressed P2, P3). How to design a human readable documentation.
Extra: A tool for creating html readable documentation.
(Reqs addressed P4). Dereferencing your vocabulary.
(Reqs addressed A1 (partially)). Dealing with the license.
(Reqs addressed A5, P5). Reusing other vocabularies.