One of the main difficulties in using the SPARQL query language is that, before you can query an endpoint, you have to be familiar with the ontologies or vocabularies in which its data is modeled. A couple of weeks ago Machinalis released a new framework called Quepy, which composes SPARQL queries from natural language questions. The snapshot below shows an example in which I asked for information about Quentin Tarantino:
The idea is very simple: you write your question in natural language and then you can test the generated query against DBpedia, the most popular dataset in Linked Data. The types of queries supported at the moment are limited (simple statements taken from the most popular vocabularies), but I expect the coverage to improve in the near future.
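To give a rough idea of what this kind of translation involves, here is a minimal sketch in Python. It is not how Quepy is implemented and it does not use Quepy's API: the `question_to_sparql` function, the single question pattern, and the choice of `rdfs:label` and `dbo:abstract` as the properties to query are all assumptions made for illustration.

```python
import re

# Hypothetical pattern covering only "Who is X?" / "What is X?" questions.
# A real system such as Quepy supports a much richer set of question shapes.
QUESTION = re.compile(
    r"^(?:who|what)\s+(?:is|was)\s+(?P<name>.+?)\s*\?*$",
    re.IGNORECASE,
)

# Assumed query shape: find the resource by its English label and fetch
# its English abstract from the DBpedia ontology.
TEMPLATE = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?abstract WHERE {{
  ?entity rdfs:label "{name}"@en .
  ?entity dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "en")
}}
LIMIT 1
"""


def question_to_sparql(question):
    """Return a SPARQL query string, or None if the question is not understood."""
    match = QUESTION.match(question.strip())
    if match is None:
        return None
    name = match.group("name").strip()
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    name = name.replace("\\", "\\\\").replace('"', '\\"')
    return TEMPLATE.format(name=name)


if __name__ == "__main__":
    print(question_to_sparql("Who is Quentin Tarantino?"))
```

Even this toy version shows where the difficulty lies: every new kind of question needs its own pattern and its own mapping to vocabulary terms, which is why the set of supported queries starts out small.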
Quepy aims to bridge the gap between users and the technology, which in my opinion is precisely what we should do to get people to start using semantic technologies. However, the release of the framework generated some discussion on the mailing lists. Some people didn’t find it useful, arguing that the results obtained from the SPARQL queries generated by Quepy were not as accurate as those returned by Google or Bing. This is true, although I don’t think it’s fair to compare a newly released (and limited) framework with the search engines of two of the biggest companies in the market. Furthermore, the results depend on the dataset against which the queries are run. Other datasets might give more accurate results, which is part of the beauty of the approach: you don’t depend on a single dataset.
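The point about not depending on a single dataset can be sketched in a few lines of Python. Under the SPARQL protocol, a query is sent to an endpoint as the `query` parameter of an HTTP request, so the same query text can be pointed at different endpoints. The snippet below only builds the request URLs and sends nothing; the second endpoint is a hypothetical placeholder, and the query assumes the target dataset uses `rdfs:label` for names.

```python
from urllib.parse import urlencode

# DBpedia's public endpoint plus a hypothetical second endpoint (placeholder URL).
ENDPOINTS = {
    "dbpedia": "https://dbpedia.org/sparql",
    "other": "https://example.org/sparql",
}

# A query that relies only on rdfs:label, so it does not assume a
# dataset-specific vocabulary.
QUERY = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?entity WHERE {
  ?entity rdfs:label "Quentin Tarantino"@en .
}
LIMIT 5
"""


def request_url(endpoint, query):
    """Build the GET URL that carries the query to a SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})


# One URL per endpoint, all carrying the same query text.
urls = {name: request_url(url, QUERY) for name, url in ENDPOINTS.items()}

if __name__ == "__main__":
    for name, url in urls.items():
        print(name, url)
```

In practice the hard part is not the transport but the vocabulary: a query written against one dataset's ontology often has to be adapted before it returns anything useful from another.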
Quepy is a nice initiative, but much work remains to be done in order to map complex natural language queries to languages like SPARQL. The potential of this kind of tool is clear: it can give an exact answer to what the user is looking for by checking the question against several distributed databases instead of independent silos of information (like Bing or Google). Some people in our group are researching this topic as well, analyzing the most common patterns used when querying Linked Data datasets. Combining natural language processing with these common-pattern approaches could cover more queries than either kind of system covers on its own.