Linking Research

Permanent identifiers and vocabulary publication: purl.org and w3id

Posted by dgarijov on January 17, 2016

Some time ago, I wrote a tutorial on common practices for publishing vocabularies/ontologies on the Web. In particular, the second step of the tutorial gave guidelines for setting a stable URI for your vocabulary. The tutorial referred to purl.org, a popular service for creating permanent URLs on the web. Purl.org had been working for more than 15 years and was widely used by the community.

However, several months ago purl.org stopped registering new users. Then, only a couple of months ago, the website stopped letting existing users register or edit their permanent URLs. The official response is that there is a problem with the SOLR index, but I am afraid that the service is not reliable anymore. The current purl redirects work properly, but I have no clue whether they intend to keep maintaining it in the future. It’s a bit sad, because it was a great infrastructure and service to the community.

Fortunately, other permanent identifier efforts have been hatched successfully by the community. In this post I am going to talk a little about w3id.org, an effort launched by the W3C permanent identifier community group that has been adopted by a great part of the community (with more than 10K registered ids). W3id is supported by several companies, and although there is no official commitment from the W3C for maintenance, I think it is currently one of the best options for publishing resources with a permanent id on the web.

Differences with purl.org: w3id is a bit geekier, but way more flexible and powerful when doing content negotiation. In fact, you don’t need to talk to your admin to do the content negotiation because you can do it yourself! Apart from that, the main difference between purl.org and w3id is that you don’t have a user interface to edit your purls. Instead, you edit the .htaccess files through Github.

How to use it: let’s imagine that I want to create a vocabulary for my domain. In my example, I will use the coil ontology, an extension of the videogame ontology for modeling a particular game. I have already created the ontology, and assigned it the URI: https://w3id.org/games/spec/coil#. I have produced the documentation and saved the ontology file in both rdf/xml and TTL formats. In this particular case, I have chosen to store everything in one of my repositories in Github: https://github.com/dgarijo/VideoGameOntology/tree/master/GameExtensions/CoilOntology. So, how to set up the w3id for it?

  1. Go to the w3id repository and fork it. If you don’t have a Github account, you must create one before forking the repository.
  2. Create the folder structure you assigned in the URI of your ontology (I assume that you won’t be rewriting somebody else’s URI, as if that is the case, the admins will likely detect it). In my example, I created the folders “games/spec/” (see in repo)
  3. Create the .htaccess. In my case it can be seen in the following url: https://github.com/perma-id/w3id.org/blob/master/games/spec/.htaccess. Note that I have included negotiation for three vocabularies in there.
  4. Commit your changes and push them to your fork.
  5. Create a pull request to the perma-id repository.
  6. Wait until the admins accept your changes.
  7. You are done! If you want to add more w3id ids, just push them to your fork and create additional pull requests.
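
To give an idea of what step 3 involves, here is a minimal Python sketch that prints the kind of rewrite rules such an .htaccess could contain. It is an illustration under assumptions, not a copy of the file linked above: the function name `render_htaccess` is made up for this example, the 303 status code is an assumed default, and the real file in the w3id repository may be organized differently. The target URLs are the ones used in this post.

```python
import re

# Target URLs taken from the post; everything else here is illustrative.
BASE = "http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/"
DOC_URL = BASE + "coilDoc/"
TTL_URL = BASE + "coil.ttl"
OWL_URL = BASE + "coil.owl"


def render_htaccess(name, html_url, turtle_url, rdfxml_url, status=303):
    """Return .htaccess text that redirects `name` according to the Accept header.

    `name` is the last path segment of the permanent id (e.g. "coil" for
    games/spec/coil). The file is meant to live in the folder created in
    step 2, so the rule patterns are relative to that folder.
    """
    pattern = "^" + re.escape(name) + "$"
    flags = f"[R={status},L]"
    lines = [
        "RewriteEngine on",
        "",
        "# Turtle requested: redirect to the TTL serialization",
        "RewriteCond %{HTTP_ACCEPT} text/turtle",
        f"RewriteRule {pattern} {turtle_url} {flags}",
        "",
        "# RDF/XML requested: redirect to the OWL file",
        r"RewriteCond %{HTTP_ACCEPT} application/rdf\+xml",
        f"RewriteRule {pattern} {rdfxml_url} {flags}",
        "",
        "# Anything else (e.g. a browser): redirect to the HTML documentation",
        f"RewriteRule {pattern} {html_url} {flags}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_htaccess("coil", DOC_URL, TTL_URL, OWL_URL))
```

The order of the rules matters in this sketch: the specific formats are checked first, and the last rule has no condition so that it acts as the default.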

Now every time somebody accesses the URL https://w3id.org/games/spec/coil#, it will redirect to where the htaccess file points to. In my case, http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/coilDoc/ for the documentation, http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/coil.ttl for TTL and http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/coil.owl for rdf/xml. This also works if you only want simple 302 redirections. W3id administrators are usually very fast to review and accept the changes (so far I haven’t had to wait more than a couple of hours before having everything reviewed). The whole process is perhaps slower than purl.org used to be, but I really like the approach. And you can do negotiations that you were unable to achieve with purl.org.
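
The negotiation described above can be modelled in a few lines of Python. This is only a rough model of the decision, assuming the three targets listed in this post and a fallback to the HTML documentation; on w3id the decision is made by the rewrite rules in the .htaccess file, not by code like this. The names `parse_accept` and `negotiate` are invented for the example.

```python
# Target URLs taken from the post; the mapping itself is an assumption.
BASE = "http://dgarijo.github.io/VideoGameOntology/GameExtensions/CoilOntology/"
DOC_URL = BASE + "coilDoc/"
TTL_URL = BASE + "coil.ttl"
OWL_URL = BASE + "coil.owl"

TARGETS = {
    "text/html": DOC_URL,
    "text/turtle": TTL_URL,
    "application/rdf+xml": OWL_URL,
}


def parse_accept(header):
    """Return the media types of an Accept header, most preferred first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            # Sort by descending quality, then by order of appearance.
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


def negotiate(accept_header):
    """Pick the redirect target for a request to the permanent id."""
    for media_type in parse_accept(accept_header):
        if media_type in TARGETS:
            return TARGETS[media_type]
    return DOC_URL  # fallback: send everything else to the documentation


if __name__ == "__main__":
    for header in ("text/turtle", "application/rdf+xml", "text/html", "*/*"):
        print(header, "->", negotiate(header))
```

A browser sending text/html ends up at the documentation, while a tool asking for text/turtle or application/rdf+xml gets the matching serialization.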

Http vs https: As a final comment, w3id uses https. If you publish something with http, it will be redirected to https. This may look like an unimportant detail, but it is critical in some cases. For example, I have found that some applications cannot negotiate properly if they have to handle a redirect from http to https. An example is Protégé: if you try to load http://w3id.org/games/spec/coil#, the program will raise an error. Using https in your URI works fine with the latest version of the program (Protégé 5).
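
If you pass w3id URIs to tools that may not follow the http to https redirect, one defensive option is to rewrite the scheme before loading. The helper below is a small sketch of that idea; the name `prefer_https` is hypothetical, and it only touches URIs on w3id.org. It works on the string prefix on purpose, so that the trailing # of a namespace URI is kept as it is.

```python
def prefer_https(uri, host="w3id.org"):
    """Return `uri` with https instead of http when it points at `host`.

    Other URIs are returned unchanged. Only the scheme prefix is replaced,
    so the path and any trailing '#' are preserved exactly.
    """
    http_prefix = "http://" + host
    rest = uri[len(http_prefix):]
    # Require a path separator (or nothing) after the host, so that
    # look-alike hosts such as w3id.org.example.com are left untouched.
    if uri.startswith(http_prefix) and (rest == "" or rest[0] in "/#?"):
        return "https://" + host + rest
    return uri


if __name__ == "__main__":
    print(prefer_https("http://w3id.org/games/spec/coil#"))
```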


9 Responses to “Permanent identifiers and vocabulary publication: purl.org and w3id”

  1. […] Permanent identifiers and vocabulary publication: purl.org and w3id […]

  2. Raquel C. said

    How do you create ontology ‘https://w3id.org/games/spec/coil#’? I mean, I understand the steps you explain but at some point, I got lost. I want to create my own ontology from the very beginning. I think I missed the step you create your video game ontology. It would be great if you can help me with that. Thanks 😉

  3. dgarijov said

    Hi Raquel,
    this post assumes that you already created your ontology at some point. If you want to learn how to create ontologies, I would recommend looking for a Protege tutorial such as https://www.youtube.com/watch?list=PLea0WJq13cnAfCC0azrCyquCN_tPelJN1&v=R9ERlUgvgwM.
    I hope this helps.

  4. Raquel C. said

    That is so helpful 🙂 Thank you very much! Just one thing, I am currently working with Protege and it provides you a ‘local’ URI but I would like to use a real one. What do I need to obtain one from Github? Just upload my .owl or .rdf? or do I need something else? Sorry for bothering you

    • dgarijov said

      Yes, you would need to upload your rdf to github. However, I think it would be better to create a w3id url in case you want to host your ontology somewhere else (as I mention in the post)

  5. […] to create. There were therefore plans to set up an alternative service: w3id.org. 1) Cf. Permanent identifiers and vocabulary publication: purl.org and w3id […]

  6. ch said

    The Internet Archive runs purl.org now. This seems more reliable than using a commercial infrastructure like Github. Github only has to change its business model and w3id will have problems.

    Link to Internet Archive blog: https://blog.archive.org/2016/09/27/persistent-url-service-purl-org-now-run-by-the-internet-archive/

    PS: Thanks for your blog, it’s been a very useful resource to me!

    • dgarijov said

      Well, Github has a lot of support from the community. I think it is unlikely that they suddenly change their business model. Besides, w3id uses git, so I guess that in the event of that happening, they can also adopt another approach.

      The Internet Archive initiative is great. Thanks for posting the pointer. However, I still haven’t been able to access most of the purls I created in purl.org. I hope they manage to add them soon…

  7. […] Permanent identifiers and vocabulary publication: purl.org and w3id […]
