Paper 13 (Resources track)

EventKG: A Multilingual Event-Centric Temporal Knowledge Graph

Author(s): Simon Gottschalk, Elena Demidova

Full text: submitted version / camera ready version

Decision: accept

Abstract: One of the key requirements to facilitate semantic analytics of information regarding contemporary and historical events on the Web, in the news and in social media is the availability of reference knowledge repositories containing comprehensive representations of events and temporal relations. Existing knowledge graphs, with popular examples including DBpedia, YAGO and Wikidata, focus mostly on entity-centric information and are insufficient in terms of their coverage and completeness with respect to events and temporal relations. EventKG, presented in this paper, is a multilingual event-centric temporal knowledge graph that aims to address this gap. EventKG incorporates over 690 thousand contemporary and historical events and over 2.3 million temporal relations extracted from several large-scale knowledge graphs and less structured sources, and makes this information available through a canonical representation. In this paper we present EventKG including its data model, extraction process and characteristics, and discuss its relevance for several real-world applications including Question Answering, timeline generation and cross-cultural analytics.

Keywords: events; multilinguality; knowledge graph; temporal relations; data extraction

 

Review 1 (by anonymous reviewer)

 

The paper describes the construction of an event knowledge graph that extracts events, their spatio-temporal context, and other important actors from various sources, normalizes them and makes them available in a standard Semantic Web format. This is a complementary, event-centric resource that is of contemporary interest and which can potentially support a variety of applications ranging from timeline generation and question answering to region-specific analysis of differences and interpretations.
The resource development seems to have used best practice in terms of reusing existing vocabularies and building on existing ontologies. It also discusses maintenance and sustainability issues. Overall, the paper is well-written.
Minor issue:
———————
The phrase “temporal relation” normally conveys a time-based relationship among events and other concepts, while here it has been used in a much broader sense, causing confusion. The authors should clarify this fact early on and eliminate the ambiguity.
The impact of the paper could be improved by providing realistic illustrative examples of concrete events and reasoning over them.
======
I have read the rebuttal, which did not address the expressed concern.

 

Review 2 (by Marieke van Erp)

 

== 
I have read the authors' rebuttal but I consider that my concerns with the paper have not been alleviated. 
==
The paper presents a new resource that is centred on events rather than entities. This EventKG contains information extracted from DBpedia, Wikidata, Wikipedia event lists and the Wikipedia Current Events Portal. The resource looks like a welcome addition to the abundance of entity-centric resources available but I have a few suggestions that would improve the paper:
1. The paper does not discuss the concept "event" extensively, nor the complexities associated with modelling this kind of information. In Section 3, the paper states "According to the event definition commonly used in the literature, an event is something that happens at a given place and time between a group of actors [3]". However, this definition does not say anything about how big or small an event can be. The paper gives WWII as an example of an event with sub-events, but WWII took place at lots of different locations (unless you state the location of the event as "the world") and over a longer time period. At the other end of the spectrum, the paper gives the example of one of WWII's sub-events, "Erwin Rommel arrives in Tripoli", which can itself be further split up into events such as "Erwin Rommel's vehicle crosses the city border", "Erwin Rommel gets out of the car" or "Erwin Rommel installs himself into his office". The question is how far you want to go and how the difference in granularity is dealt with. An interesting paper that may shed some light on this is: Agata Cybulska, Piek Vossen (2010) Event Models for Historical Perspectives: Determining Relations between High and Low Level Events in Text, Based on the Classification of Time, Location and Participants. In proceedings of LREC 2010. http://www.lrec-conf.org/proceedings/lrec2010/pdf/205_Paper.pdf Do you only model events and sub-events, or also other relationships such as “overlaps” as defined in Allen’s 1983 interval algebra? Furthermore, how does the resource deal with recurring events? For example, for London Ribfest, an annual festival that's been held since 1985, the resource only gives the date 1985 (http://eventkginterface.l3s.uni-hannover.de/resource/event_427802). This is not surprising as Wikipedia doesn't give any further information, but it is incomplete. Would you want to have separate instances (Ribfest 1985, Ribfest 1986, etc.) or have Ribfest more as a type of event? These are issues that are not easy to resolve, but it would be good to see some more discussion of them in the paper, as they do have implications for the resource's usefulness. 
2. The paper states in several places that EventKG can be useful to different fields such as digital humanities. It would be good if the paper could give some more concrete examples. Also, with respect to the point raised in 1), humanities scholars may be good to have in the loop for developing such a resource and getting the community to use it (also, Semantic Web techniques haven't completely permeated the field yet). 
3. I think it would be good if some more space were devoted to explaining how the extraction and linking were performed. From the example in Section 2 and the pipeline in Section 4, it is not clear to me how sub-events are identified. Also, Section 4 lists YAGO as a reference source, but then says that "we do not use the YAGO ontology to identify events due to noisy sub-categories of events we observed". So how are events identified? Also, in Section 1, the paper states "from 325,975 events originating from DBpedia [..] only 53.87% are classified using dbo:Event". So how are events found in DBpedia to include in EventKG? Especially since Section 4 states "For each language edition included in EventKG, we identify events from DBpedia by extracting instances of dbo:Event", this seems to contradict the statement in Section 1. It would also be good if the authors could motivate the strength and popularity measures some more, as these may remove pieces of information from the long tail that could be very interesting for users of the KG. 
4. Some more statistics on the different elements of the KG would be nice. For example, page 2 states "For example, only 11.81% of the events in Wikidata provide temporal and 33.47% spatial information". How many contain both? Furthermore, in Section 5 numbers are given for how many events contain start times and how many contain end times, but how many contain both (see the query sketch after this list)? This could give a sense of the 'completeness' of the information in the resource. Also, in the KG creation process, certain types of information are fused, such as the locations Paris, France and Lyon. It would be good to know for how many of the instances this type of operation applies, to give a reader/potential user more insight into the setup of the data. Perhaps this could be presented in a graphical manner? 
5. Evaluation of the resource. This is related to the statistics, but how do you know if the information in the resource is correct? There are extraction, matching and fusion operations performed in the creation of the resource, but what if any of the information in the sources is incorrect? 
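As an aside on point 4: a query along the following lines could report how many events carry both a start and an end time. This is only a rough sketch - it assumes EventKG exposes SEM's sem:hasBeginTimeStamp and sem:hasEndTimeStamp directly on sem:Event instances, which may not match the released schema exactly:

PREFIX sem: <http://semanticweb.cs.vu.nl/2009/11/sem/>
# Count the events that have both a begin and an end timestamp
# (class and property names assumed from SEM; adjust to the actual EventKG schema).
SELECT (COUNT(DISTINCT ?event) AS ?eventsWithBothTimes)
WHERE {
  ?event a sem:Event ;
         sem:hasBeginTimeStamp ?begin ;
         sem:hasEndTimeStamp   ?end .
}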
To address these issues, the background section can perhaps be condensed a bit; some of the information that is in there is also covered again in the related work section, so perhaps these can be merged? 
Textual suggestions:
Section 3:
SEM lacks possibilities to directly represent a broad range of relations -> this is a feature, not a bug 
Section 4:
Figure 3 is somewhat small

 

Review 3 (by Fiona McNeill)

 

The paper describes a knowledge graph that focusses on events rather than (as is more standard) entities. The authors motivate well why this is a useful thing to do and how it draws out and makes explicit information about events that may be implicit or buried in an entity-centric knowledge graph. It is clear that there are circumstances where easily accessing event information would be very important, and the authors describe these well. There is a clear description of the relevance of this work to many other communities.
The authors have clearly thought a lot about the principles of Semantic Web resources and have taken care to follow best practice, FAIR data principles, etc. This should make it easy for others to reuse. The paper is well written and lays out the motivation, principles and reusability very clearly. It seems clear that the resource builds on and significantly extends the state of the art.
I have two reservations about the paper. Firstly, it doesn't seem like the resource is actually being used - although I think it is clearly useful, it's not clear that anyone has taken it up. That may well be because it is new, but I would like to see some examples of it in use to demonstrate that this really does meet a need. The other concern I have is that there's no real discussion of the quality of the resource - are they sure that the steps taken in Section 4 create a knowledge graph that is largely accurate? How have they determined this? It is great that it has so many nodes and relations, but if these are not correct that impinges significantly on the usefulness of the resource. I'd like to see the authors commenting on this.
Although the paper is well-written and the standard of English is good, there could be a few improvements. I'd like to see a worked example in Section 4 to make it clear exactly what each stage of the process is doing. In Section 3, they mention limitations of SEM and illustrate in Figure 2 how EventKG overcomes these, but it's not apparent to me (not knowing SEM) why what they have illustrated isn't possible in SEM and what their approach is doing to improve this.
The word 'popular' is used strangely - for example in the question 'who were the most popular actors involved in Brexit'. I think they mean which individuals were involved in the most events, but that's not at all how that question would read to a human.
Why does section 3 begin in italics?
Overall, this seems like promising work that the authors have ensured adheres to Semantic Web principles. It is easy to access the resource from the website and the website also contains information and SPARQL endpoints (though I'd like to see more of both of these). However, I would like to see some discussion of the quality of the resource and some evidence that it is being used.

 

Review 4 (by Antoine Isaac)

 

This paper presents EventKG, a large knowledge base of event data available in 5 languages. 
This has great potential value for many applications.
The data is openly licensed and available on the web. The extraction software too.
The work done on extracting information and alignment seems state-of-the-art. 
The model also seems very appropriate to the job at hand: simple enough to allow easy re-use of the core data, and more complex when it comes to capturing provenance and enabling advanced users to select the sources they trust most.
Comparison with related sources (which have been used to build EventKG) is made and demonstrates that EventKG clearly adds something on top of them.
I am however split about recommending this paper for acceptance. The work is impressive in quantitative terms and in many methodological aspects, but there is absolutely no evaluation in qualitative terms. How relevant are the events extracted from unstructured sources, or even from large knowledge graphs in which many things count as events (e.g. are all ‘occurrences’ in Wikidata relevant)? How good is the fusion of knowledge? Is it missing some of the possible/required merging of events from the original sources (in different languages)? Are the rules to select the dates accurate? How many conflicts are there between EventKG and one of its sources? There is nothing about this. Of course this is hard on such a big dataset and large unstructured sources, but one could at least hope to see a bit of insight from the developers of the data.
This reviewer is aware of at least one project in the Digital Humanities with similar aims of gathering sources of historical data. But that project is hesitant to do this because (1) it's hard; (2) it is difficult to claim authority status while the quality of the resulting information is not guaranteed. 
Again, I am aware that EventKG is cautious and keeps track of provenance. And I applaud that the authors dare to embark on such work. Yet, the question of the authority of the data it contains is crucial for the application scenarios that the paper targets. Researchers in Digital Humanities and builders of interfaces will re-use EventKG if they know it can be trusted. At least it should be more transparent about potential issues.
Actually the first event I randomly picked at http://eventkg.l3s.uni-hannover.de/data/v1/events.nq shows an issue:
[
<event_216> owl:sameAs <http://www.wikidata.org/entity/Q24912593> eventKG-g:wikidata .
<event_216> owl:sameAs <http://dbpedia.org/resource/Football_at_the_2010_Asian_Games_–_Women> eventKG-g:dbpedia_en .
<event_216> rdfs:label "Football_at_the_2010_Asian_Games_–_Women"@en eventKG-g:wikipedia_en .
<event_216> dcterms:description "The men's football tournament at the 2010 Asian Games was held in Guangzhou in China from 8 November to 25 November."@en eventKG-g:wikipedia_en .
]
Interestingly, this issue seems to be caused by the Wikipedia source: https://en.wikipedia.org/wiki/Football_at_the_2010_Asian_Games_%E2%80%93_Women. So it's hard to blame the authors of the paper for this one. Still, it shows some fragility of the knowledge base.
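The quads above sit in named graphs (eventKG-g:wikidata, eventKG-g:dbpedia_en, eventKG-g:wikipedia_en) that record their source, so a reader wanting to trace such a conflict back to its origin could list everything asserted about the event per graph. A rough sketch, with <event_216> standing in for the full EventKG resource URI of the event quoted above:

# List all statements about the event together with the named graph
# (i.e. the source) that asserts them, so that conflicting values such as
# the label/description mismatch above can be traced to their origin.
SELECT ?graph ?property ?value
WHERE {
  GRAPH ?graph {
    <event_216> ?property ?value .
  }
}
ORDER BY ?graph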
Note that the lack of evaluation also calls the reported statistics into question, especially wrt. the coverage of EventKG. A large part of the hundreds of thousands of events the authors claim to add on top of Wikidata, DBpedia etc. could in fact be attributed to a low-recall event merging strategy. What if the Wikipedia events in the 5 languages reported in Table 4 are common but are not integrated well together? This could result in tens or hundreds of thousands of ‘copies’ of an event that add to EventKG's numbers quite unfairly.
Minor comments
- in the intro (p2) “data models (i.e. SEM)” and “vocabularies (e.g. DBo and DC)” are quite at the same level
- the text for the application scenarios could be shortened. The ESWC audience will be familiar with the potential value of such a KG for question answering and timelines.
- the text on event popularity (p4), and in fact the whole approach, seems to mix event popularity and the popularity of persons participating in an event. It’s not quite the same thing, many historians could say.
- the sentences “Compared to the SEM model [...]. Furthermore [...]” on p6 could be removed as they are quite redundant with other parts of the text.
- I was really surprised by the rule on p10 that selects {Paris, Lyon} from {Paris, France, Lyon}. Some “wide” events (like the French Revolution) could be said (in the original sources) to happen both in the specific places where important sub-events happened and in France in general. Removing France would lose a lot of information (see the sketch after these comments). This is one of the points for which an evaluation would be needed.
- for ranking resources and Wikidata and DBpedia (and the different rankings in different languages), the work at http://people.aifb.kit.edu/ath/ is relevant. I’d be curious to see whether/how it can be integrated in Event KG.
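For illustration, the location rule I question above could be approximated by a query of the following shape, which keeps a place only if no other place listed for the same event is contained in it. This is my own paraphrase, not the authors' implementation, and the containment properties (dbo:country, dbo:isPartOf) are assumptions:

PREFIX sem: <http://semanticweb.cs.vu.nl/2009/11/sem/>
PREFIX dbo: <http://dbpedia.org/ontology/>
# Keep only the most specific locations of an event: drop a place when
# another listed place lies within it (e.g. drop France when Paris and
# Lyon are also listed). My concern is exactly that this drops France.
SELECT ?event ?place
WHERE {
  ?event sem:hasPlace ?place .
  FILTER NOT EXISTS {
    ?event sem:hasPlace ?other .
    FILTER(?other != ?place)
    ?other dbo:country|dbo:isPartOf ?place .
  }
}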
— AFTER REBUTTAL
I thank the authors for the answer given to my comments on evaluation. It seems however that the main point has been missed. I am well aware of the fact that provenance is kept, and this is very useful. My problem is with the data *quality* of the whole thing. Relying on the data quality of the original sources is one point, but still one could expect that a source on events that derives data from other sources, and aims to provide it with authority, would check the quality of what is reused from these sources. See the error I have found above, caused by the Wikipedia source.
There is also a dimension of quality that is inherent to the merging of the sources. That includes possible conflicts across sources, not within sources - i.e. different sources saying different things. Here the authors' statement that “93.5% of the events agree on the start times across different representations” is a clear step in the right direction. One would want more of this, but there needs to be much more insight given. First, because that sort of statement is still very dependent on the matching step. A very conservative approach to matching across sources - say, one that would require begin and end dates to match - would naturally miss many matches and result in very high statistics like this one (because a low matching rate leads to very few conflicts, basically). This is very possible even if 82% of EventKG events are derived from several sources, considering that the sources provide very little time data (Table 5). 
But on second thought I am not sure one can understand the figure that is reported. How can 93.5% of 82% of the events agree on start times - which would amount to roughly 77% of all events - if (according to Table 5) only 51.21% of the events in EventKG have a known time? I remain therefore quite unconvinced.

 
