Author(s): Mohammad Abdel-Qader, Ansgar Scherp, Iacopo Vagliano
Abstract: Vocabularies are used for modeling data in Knowledge Graphs (KGs) like the Linked Open Data Cloud and Wikidata. During their lifetime, vocabularies are subject to changes. New terms are coined, while existing terms are modified or deprecated. We first quantify the amount and frequency of changes in vocabularies. Subsequently, we investigate to which extend and when the changes are adopted in the evolution of KGs. We conduct our experiments on three large-scale KGs for which time-stamped information is available, namely the Billion Triples Challenge datasets, Dynamic Linked Data Observatory dataset, and Wikidata. Our results show that the change frequency of terms is rather low, but can have high impact due to the large amount of distributed graph data on the web. Furthermore, not all coined terms are used and most of the deprecated terms are still used by data publishers. The adoption time of terms coming from different vocabularies ranges from very fast (few days) to very slow (few years). Surprisingly, we could observe some adoptions before the vocabulary changes were published. Understanding the evolution of vocabulary terms is important to avoid wrong assumptions about the modeling status of data published on the web, which may result in difficulties when querying the data from distributed sources.
Keywords: Vocabulary changes; Terms adoption; Deprecated terms