Paper 118 (Research track)

GSP (Geo-Semantic-Parsing): Geoparsing and Geotagging with Machine Learning on top of Linked Data

Author(s): Marco Avvenuti, Stefano Cresci, Leonardo Nizzoli, Maurizio Tesconi

Abstract: Recently, user-generated content in social media has opened up new possibilities for understanding the geospatial aspects of many real-world phenomena. Yet the vast majority of such content lacks explicit, structured geographic information. Here, we describe the design and implementation of a novel approach for associating geographic information with text documents. GSP applies machine learning algorithms on top of the rich, interconnected Linked Data in order to overcome limitations of previous state-of-the-art approaches. In detail, our technique performs semantic annotation to identify relevant tokens in the input document, traverses a sub-graph of Linked Data to extract possible geographic information related to the identified tokens, and optimizes its results by means of a Support Vector Machine classifier. We compare our results with those of 4 state-of-the-art techniques and baselines, on ground-truth data from 2 evaluation datasets. Our GSP technique achieves excellent performance, with a best F1 of 0.91, substantially outperforming the benchmarked techniques, which achieve F1 < 0.78.

Keywords: Geoparsing; machine learning; linked data; Twitter
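
The three stages named in the abstract (semantic annotation, Linked Data sub-graph traversal, classifier-based filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: the annotator table, the `ex:` resource identifiers, the graph contents, the approximate coordinates, and the two features are all hypothetical, and a fixed linear decision function with made-up weights stands in for a trained Support Vector Machine.

```python
from collections import deque

# Hypothetical annotator output: surface form -> (resource id, annotator confidence).
ANNOTATOR = {
    "pisa": ("ex:Pisa", 0.9),
    "tonight": ("ex:Tonight_(album)", 0.2),
}

# Hypothetical Linked Data sub-graph: resource -> optional coordinates and outgoing links.
# Coordinates are approximate and for illustration only.
GRAPH = {
    "ex:Pisa": {"coords": None, "links": ["ex:Pisa_city"]},
    "ex:Pisa_city": {"coords": (43.72, 10.40), "links": []},
    "ex:Tonight_(album)": {"coords": None, "links": []},
}


def annotate(text):
    """Stage 1: return (token, resource, confidence) for tokens the annotator knows."""
    hits = []
    for raw in text.split():
        token = raw.strip(".,!?")
        if token.lower() in ANNOTATOR:
            resource, confidence = ANNOTATOR[token.lower()]
            hits.append((token, resource, confidence))
    return hits


def find_coordinates(resource, max_hops=2):
    """Stage 2: breadth-first search for the nearest resource that carries coordinates."""
    queue, seen = deque([(resource, 0)]), {resource}
    while queue:
        node, hops = queue.popleft()
        entry = GRAPH.get(node, {"coords": None, "links": []})
        if entry["coords"] is not None:
            return entry["coords"], hops
        if hops < max_hops:
            for nxt in entry["links"]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return None, None


# Stage 3 stand-in: a linear decision function w.x + b > 0, the form a linear SVM
# takes after training. These weights are invented, not learned from data.
WEIGHTS, BIAS = (2.0, -0.5), -1.0


def accept(confidence, hops):
    """Keep a candidate only if the linear score is positive."""
    return WEIGHTS[0] * confidence + WEIGHTS[1] * hops + BIAS > 0


def geoparse(text):
    """Run the three stages and return (token, (lat, lon)) pairs that pass the filter."""
    results = []
    for token, resource, confidence in annotate(text):
        coords, hops = find_coordinates(resource)
        if coords is not None and accept(confidence, hops):
            results.append((token, coords))
    return results


print(geoparse("Flooding near Pisa tonight."))
```

In this toy setup, a candidate survives only if the graph traversal reaches a resource with coordinates and the filter scores it positively; in a real system the filter would be a classifier trained on labeled examples rather than fixed weights.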
