Shutting down the Open Knowledge Graph
The Open Knowledge Graph, an attempt to open up the Google Knowledge Graph by means of crowdsourcing, is history. Between its initial announcement on August 11 and today, the graph grew from 0 triples to exactly 2,850,510 RDF triples. This impressive figure was reached solely through passionate users who participated in the Search for Embedded Knowledge Items effort (SEKI@home) by sharing their Google search activities. In the view of the authors, knowledge bases like DBpedia or Freebase over-deliver facts. In contrast, the Open Knowledge Graph made accessible only a subset of the most interesting facts about entities, derived from the Google Knowledge Graph, and it did so in a machine-readable way through the SPARQL protocol.
However, Google clarified for us that by design the data in the Knowledge Graph is available only via a consumer interface. Jack Menzel, Product Management Director at Google, contacted us with the following statement:
"We try to make data as accessible as possible to people around the world, which is why we put as much data as we can in Freebase. However there are a few reasons we can't participate in your project.
First, the reason we can't put all the data we have into Freebase is that we've acquired it from other sources who have not granted us the rights to redistribute. Much of the local and books data, for example, was given to us with terms that we would not immediately syndicate or provide it to others for free.
Other pieces of data are used, but only with attribution. For example, some data, like images, we feel comfortable using only in the context of search (as it is a preview of content that people will be finding with that search) and some data like statistics from the World Bank should only be shown with proper attribution.
With regards to automatic access to extract the ranking of the content: we block this kind of access to Google because our ranking is the proprietary core of what Google provides whenever you use search—users should access Google via the interfaces we provide."
In consequence, we are shutting down the Open Knowledge Graph, which means that we will no longer provide access to the data via the SPARQL endpoint previously located at http://openknowledgegraph.org/sparql. We will keep the SEKI@home Chrome extension online for future use (so if you have it installed, please do not uninstall it quite yet), but we will remove the Google-scraping functionality from it. We will also keep the main Open Knowledge Graph homepage (http://openknowledgegraph.org/) online, as our paper titled "SEKI@home, or Crowdsourcing an Open Knowledge Graph" was accepted for publication at the 1st International Workshop on Knowledge Extraction and Consolidation from Social Media (KECSM2012), co-located with the 11th International Semantic Web Conference (ISWC2012).
The good news is that where there is shadow, there is light: the folks over at Freebase let us know that the Freebase team has always been committed to supporting the Linked Open Data community, and that plans are in the works to make the Freebase dumps they already provide available in RDF.
Thanks to all RDF triple scrobblers for contributing to the Open Knowledge Graph. It was fun while it lasted.
Best,
Tom and Stefan
Full disclosures:
(i) This post was reviewed, but not edited for content, by D. Price, Product Counsel at Google, and J. Menzel, Product Management Director at Google.
(ii) T. Steiner is a Google employee. S. Mirea is a Google intern at the time of writing. The Open Knowledge Graph was developed in their own time as an independent research project, under their affiliations with Universitat Politècnica de Catalunya and Jacobs University Bremen, respectively.