Category: Research Object

Elevator pitch

As a PhD student, I have often been asked about the subject of my thesis and the main ideas behind my research. As a student, you always think you have a clear idea of what you are doing, at least until you actually have to explain it to someone outside your domain. In the end, it is all about using the right terminology. If you say something like “Oh yeah, I am trying to detect abstractions on scientific workflows semi-automatically in order to understand how they can better be reused and related to each other”, people will look at you as if you didn’t belong to this planet. Instead, something like “detecting commonalities in scientific experiments in order to study how we can understand them better” might be more appropriate.

But last week the challenge was slightly different. I was invited to give an overview talk about the work I have been doing as a PhD student. That means explaining not only what I am doing, but also why I am doing it and how it is all related, without going into the details of every step. It may seem like an easy task, but it kept me thinking longer than I expected.

As I think some people might be interested in a global overview, I want to share the presentation here as well: Have a look!


For the last three years I have been involved in the Wf4Ever project, which has developed the notion of Research Objects and their respective models (previously introduced in another post). Lately I have been exploring new ways of eating my own dog food by associating Research Objects with my papers as HTML web pages (see an example here). These Research Objects are useful because they serve as a summary of the paper in question and point to all the datasets, queries and additional materials that I could not include in the paper.

However, I realized that I was spending a lot of time creating and annotating them. So last Christmas I created a Research Object Creator tool, which takes a LaTeX file as input and extracts its title and abstract to create a page annotated in RDFa. It also produces a skeleton of the contents to reference, so you only have to fill in (and annotate, if you want) the resources to point to. A sample can be seen in the image below:

A Sample Research Object generated by the tool
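
To give an idea of what the extraction step involves, here is a minimal Python sketch. It is not the code of the actual tool: the function names are hypothetical, and the choice of `dcterms:title` and `dcterms:abstract` as the annotation properties is an assumption made for illustration. It also ignores LaTeX comments, escaped braces and optional arguments such as `\title[short]{long}`.

```python
import re
from html import escape

# Assumed page skeleton: the Dublin Core properties and the empty
# "resources" list are illustrative, not what the real tool emits.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html prefix="dcterms: http://purl.org/dc/terms/">
<head><meta charset="utf-8"/><title>{title}</title></head>
<body about="">
<h1 property="dcterms:title">{title}</h1>
<p property="dcterms:abstract">{abstract}</p>
<!-- Resources to reference: filled in by hand afterwards -->
<ul id="resources"></ul>
</body>
</html>
"""


def extract_braced(source, command):
    """Return the brace-balanced argument of \\command{...}, or None."""
    match = re.search(r"\\" + command + r"\s*\{", source)
    if match is None:
        return None
    depth, start = 1, match.end()
    for i in range(start, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                return source[start:i]
    return None  # braces never closed


def extract_abstract(source):
    """Return the body of the abstract environment, or None."""
    match = re.search(r"\\begin\{abstract\}(.*?)\\end\{abstract\}", source, re.DOTALL)
    return match.group(1) if match else None


def normalize(text):
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())


def research_object_page(latex_source):
    """Build an RDFa-annotated HTML page from the title and abstract."""
    title = normalize(extract_braced(latex_source, "title") or "Untitled")
    abstract = normalize(extract_abstract(latex_source) or "")
    return PAGE_TEMPLATE.format(title=escape(title), abstract=escape(abstract))
```

Calling `research_object_page(open("paper.tex", encoding="utf-8").read())` on a paper (the file name is just an example) would return the page as a string, ready to be written to disk and completed with the resources.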

The tool is available on GitHub, so if you want to try it out with a LaTeX paper click on the following link:

Finally, I have also created a landing page for showing the current catalog of Research Objects. The page is generated automatically: given the URI of a Research Object, it extracts its title and abstract from the RDFa descriptions. If you want to contribute new URIs, modify the Constants file in the GitHub project and I will regenerate the landing page. Note that for this project I have used the Semargl RDFa parser, which is a little bit strict when parsing the HTML pages. If your Research Object has any markup mistakes, the parser will fail.
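
For the curious, the lookup behind the landing page can be approximated with the sketch below. It is only an illustration in Python using the standard library, not the Semargl-based code of the project. It assumes the pages annotate their title and abstract with the literal attribute values `dcterms:title` and `dcterms:abstract`, and it simply reads `property` attributes instead of doing full RDFa processing (no prefix resolution). The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PropertyExtractor(HTMLParser):
    """Collect the first value found for each wanted `property` attribute."""

    def __init__(self, wanted):
        super().__init__(convert_charrefs=True)
        self.wanted = set(wanted)
        self.values = {}
        self._current = None  # property currently being captured
        self._depth = 0       # nesting depth inside the captured element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            if tag not in VOID_TAGS:
                self._depth += 1
            return
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop not in self.wanted or prop in self.values:
            return
        if attributes.get("content") is not None:
            # In RDFa, a `content` attribute takes precedence over element text.
            self.values[prop] = attributes["content"]
        elif tag not in VOID_TAGS:
            self._current, self._depth, self._buffer = prop, 1, []

    def handle_endtag(self, tag):
        if self._current is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            self.values[self._current] = " ".join("".join(self._buffer).split())
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)


def catalog_entry(uri, html_text):
    """Build one catalog row from the HTML of a Research Object page."""
    parser = PropertyExtractor(["dcterms:title", "dcterms:abstract"])
    parser.feed(html_text)
    parser.close()
    return {
        "uri": uri,
        "title": parser.values.get("dcterms:title") or uri,  # fall back to the URI
        "abstract": parser.values.get("dcterms:abstract") or "",
    }


def fetch(uri):
    """Download a page as text; network access is needed for this step."""
    with urlopen(uri) as response:
        return response.read().decode("utf-8", errors="replace")
```

A catalog would then be something like `[catalog_entry(uri, fetch(uri)) for uri in uris]`, with `uris` being the list kept in the Constants file. Unlike Semargl, Python's `html.parser` is lenient and will not fail on markup mistakes, so this sketch does not reproduce the strictness mentioned above.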