Paper 163 (Resources track)

NELL2RDF: Reading the Web, Tracking the Provenance, and Publishing it as Linked Data

Author(s): José M. Giménez-García, Maísa Duarte, Antoine Zimmermann, Christophe Gravier, Pierre Maret, Estevam R. Hruschka Jr.

Abstract: NELL is a system that continuously reads the Web to extract knowledge in the form of entities and relations between them. It has been running since January 2010 and has extracted over 120 million candidate statements. NELL’s generated data comprises all the candidate statements, together with detailed information about how each was generated. This information includes how each component of the system contributed to the extraction of the statement, as well as when that happened and how confident the system is in the veracity of the statement. However, the data is only available in an ad hoc CSV format that makes it difficult to exploit outside the context of NELL. To make it more usable for other communities, we adopt Linked Data principles to publish a more standardized, self-describing dataset with rich provenance metadata.

Keywords: NELL; RDF; Semantic Web; Linked Data; Metadata; Reification; Provenance
