Efficient Temporal Reasoning on Streams of Events with DOTR
Author(s): Alessandro Margara, Gianpaolo Cugola, Dario Collavini, Daniele Dell’Aglio
Full text: preprint
Abstract: Many ICT applications need to make sense of large volumes of streaming data to detect situations of interest and enable timely reactions. The Stream Reasoning (SR) domain aims to combine the performance of stream/event processing and the reasoning expressiveness of knowledge representation systems by adopting Semantic Web standards to represent streaming elements. In this paper, we argue that the mainstream SR model is not flexible enough to properly express the temporal relations common in many applications. We show that the model can miss relevant information and lead to inconsistent derivations. Moving from these premises, we introduce a novel SR model that provides expressive ontological and temporal reasoning by neatly decoupling their scope to avoid information loss and inconsistency. We implement the model in the DOTR system that defines ontological reasoning using Datalog and temporal reasoning using the TESLA Complex Event Processing language, which builds on metric temporal logic. We demonstrate the expressiveness of our model through various examples and benchmarks. We also show that DOTR outperforms state-of-the-art SR tools.
Keywords: Stream Reasoning; Temporal Reasoning; Stream Reasoning Model; Event Streams
Decision: probably accept
Review 1 (by Valeria Fionda)
(RELEVANCE TO ESWC) The authors propose a model that makes use of Semantic Web technologies to perform temporal reasoning on streams of events.
(NOVELTY OF THE PROPOSED SOLUTION) The proposed model improves on the state-of-the-art reasoning systems.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The proposed model is discussed in detail and several examples are provided.
(EVALUATION OF THE STATE-OF-THE-ART) The review of the state of the art is adequate.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The proposed approach has been evaluated and compared with its competitors.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The dataset and the system are available online.
(OVERALL SCORE) The paper presents a framework to perform temporal reasoning on streams of events. The approach uses RDF to encode the background knowledge and TESLA to perform temporal reasoning. Incoming events are transformed into time-annotated RDF graphs, and SPARQL is used to retrieve the relevant information from these graphs. The output of the reasoning task is given as time-annotated RDF graphs. The paper is well written and well organized. The topic is interesting and the proposed approach is reasonable. The framework is well explained. Several examples are used throughout the paper to help the reader understand what is going on. The system is available online and has been compared against state-of-the-art reasoners. If I have to find something to criticize, I suggest that the authors pay more attention to the figures. Some of them are not very legible when printed in black and white. In Fig. 1(a) it is difficult to distinguish the two windows. In Fig. 4 the data series for Esper are difficult to see.
Review 2 (by anonymous reviewer)
(RELEVANCE TO ESWC) Stream reasoning is strongly related to ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) The proposed solution combines Datalog rules with temporal reasoning using TESLA, which builds on metric temporal logic. The combination seems natural but also not surprising.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The solution seems plausible. However, since a formal presentation of DOTR is missing, it is difficult to judge its correctness and completeness.
(EVALUATION OF THE STATE-OF-THE-ART) The evaluation of the state of the art is good but can be improved by discussing the formal properties.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The motivating examples are good. However, a discussion of the formal properties is largely missing.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) Experiments are extensive.
(OVERALL SCORE) In this work, the authors provide a layered, practical stream reasoning system that allows reasoning over both static and temporal data. The authors build temporal reasoning on top of static ontological reasoning by decoupling the scopes of the reasoning layers. They present the DOTR system via a running example. The authors performed extensive experiments comparing DOTR with existing practical stream reasoning systems and presented the results in detail. However, the main issue of this paper is that a formal presentation of the DOTR model is not provided. The authors should provide a formal definition of DOTR and its semantics.
Issues
------
- The DOTR model is not formally defined, and therefore the boundaries of the expressivity of the model are not clearly presented in the paper. Although the authors have left this as future work, I think the formal definition should be presented, even if only as a sketch. For example, as mentioned in [1], TESLA is as expressive as first-order metric temporal logic. However, SPARQL, which is used as a sub-component in the DOTR model, and non-recursive safe Datalog with negation have equivalent expressive power [2]. Since the model has not been formally defined, the expressivity of DOTR cannot be inferred, even approximately, from this existing picture.
- Does the DOTR model allow arbitrary SPARQL queries? This is not clearly presented in the paper. Moreover, the SPARQL part of the example rule presented on page 6 does not really follow the SPARQL syntax (e.g. ?room2 = Smoke.?room1).
- The presentation of the mechanism rewriting TESLA rules into DOTR rules is missing.
- Is there any other motivation behind decoupling static and temporal reasoning, apart from exploiting the existing tools? If so, it is not clearly mentioned in the paper. The static and temporal reasoning described in the paper can be performed via existing temporalized Datalogs in the literature. Why would the authors prefer to define a new specification language? Is benefiting from existing tools the only reason?
- On page 6, the authors very briefly mention aggregates at the temporal reasoning level, but they do not specify what kinds of aggregation can be expressed in their model. For instance, is aggregation time-point-based or interval-based? How aggregation functions are handled in practice could differ substantially depending on whether they are based on time points or intervals.
- The example on page 4 does not illustrate well the problem with window-based models pointed out on the same page. The problems described in items i and ii can be solved with a window of size 3 and slide 2. Another example would illustrate the problem better.
- The email address of Dario Collavini is missing.
[1] G. Cugola and A. Margara, TESLA: A Formally Defined Event Specification Language.
[2] R. Angles and C. Gutierrez, The Expressive Power of SPARQL.
-----------
After rebuttal
We thank the authors for the rebuttal, which clarifies some aspects but also confirms some of the weaknesses (e.g. clarity and formalization) we mentioned in the review. If the paper gets accepted, the authors should implement the improvements promised in the rebuttal.
Review 3 (by anonymous reviewer)
(RELEVANCE TO ESWC) The paper fits the conference and the topic of this subtrack well.
(NOVELTY OF THE PROPOSED SOLUTION) The novelty of the paper is the combination of an ontological and a temporal reasoner applied to RDF streams. The approach does not make use of window operators, which avoids inconsistencies and loss of information. Its distinguishing feature is the combination of a windowless approach with background knowledge reasoning (which is not present in CEP systems).
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Ontological reasoning is done via Datalog rules, and temporal reasoning is expressed in the TESLA CEP language. The proposed system, DOTR, takes care of adapting these two technologies to the scope of RDF streams.
(EVALUATION OF THE STATE-OF-THE-ART) The related work covers many of the relevant papers in the area, and a previous survey by the authors on RDF stream processing is quite extensive. Nevertheless, a few papers on spatial-temporal reasoning and stream reasoning using ASP are missing:
Thomas Eiter, Josiane Xavier Parreira, Patrik Schneider: Spatial Ontology-Mediated Query Answering over Mobility Streams. ESWC (1) 2017: 219-237
Thu-Le Pham, Alessandra Mileo, Muhammad Intizar Ali: Towards Scalable Non-Monotonic Stream Reasoning via Input Dependency Analysis. ICDE 2017: 1553-1558
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The approach is well described. The advantage of a windowless approach combined with background knowledge reasoning is clear in terms of expressiveness. However, the feasibility of the approach is only partially addressed. Previous attempts at stream reasoning have shown that even though complex queries could be expressed, the systems in practice could not cope with the fast update rates of the incoming streams. While the approach allows for highly expressive queries, the experimental evaluation is limited to rather simple queries. (More comments on the evaluation in the next item.)
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The DOTR system is available on GitHub. For the evaluation, the paper relies on the CityBench benchmark for the datasets, the queries, and the CQELS and C-SPARQL implementations. The evaluation is well explained and the setup is clear. It would have been good to have the queries listed in the paper, but I understand the space limitations. Nevertheless, they are all available on GitHub. The evaluation focuses on latency, which is crucial in SP systems. It would have been a good addition to also address memory consumption. Given that no windows are used, it would be interesting to see how much data needs to be kept over time. As neither the benchmark nor the systems compared against focus on reasoning tasks, this aspect was perhaps under-evaluated. For instance, the authors mention that TESLA supports negation, which current SP systems do not (AFAIK). Showing that this can be achieved within reasonable performance times would be quite interesting and would greatly increase the paper's contribution.
(OVERALL SCORE) The paper presents the DOTR system, which combines temporal reasoning with incremental ontological reasoning in the context of RDF streams. Its windowless approach avoids the inconsistencies and loss of data that appear in other systems. In general, a nice paper with clear goals. Contribution and novelty could be increased if the evaluation addressed reasoning scenarios that current systems cannot cope with.
SP1: Timely topic; motivation and contribution are clear.
SP2: Advantages of the approach are clear at a conceptual level. Queries can be very expressive.
SP3: System and experimental setup available for reproduction.
WP1: Evaluation limited to simple queries and comparison to systems that do not support more complex reasoning tasks.
QAs: Could the authors clarify why they could not compare to approaches that allow more expressive reasoning, e.g. StreamRule?
***** Comments to authors' response *****
I would like to thank the authors for providing clarifications on a few items raised by the reviews. I understand the limitations of not having other SR systems available for comparison and had already taken that into account in my original score. The results presented are promising but still fall short of demonstrating the practical feasibility of more expressive queries. Therefore I will maintain my score as weak accept.
Metareview by Maria-Esther Vidal
The paper presents the DOTR framework to perform temporal reasoning on event streams represented as time-annotated RDF graphs, taking into account static background knowledge expressed as Datalog rules. The approach uses the TESLA CEP engine to perform temporal reasoning, exploiting a variant of TESLA rules to detect temporal patterns. Such rules have embedded SPARQL queries that are used to extract the facts that hold at each time point from the time-annotated data enriched by the background knowledge. The output of the reasoning task is also a time-annotated RDF graph. The paper discusses the implementation of DOTR on top of existing engines (RDFox and TESLA) and reports an extensive evaluation and performance comparison with competitor systems. The topic is clearly relevant to ESWC, and the contribution is significant. The paper is well written and well organized, and the framework is well explained by means of an example. However, it lacks a formal presentation of the DOTR model and its formal semantics, which is the main weakness of the paper. In the revised version, apart from addressing the minor issues raised by the reviewers, the authors should address the following points:
1) Give the intuition of the DOTR formal semantics.
2) Add references to the missing related work identified by Reviewer 3, with a brief discussion.
3) Briefly discuss the semantics of aggregation (as inherited from the TESLA model), and add an example showing its usage.
4) Add a short explanation of the rewriting of DOTR rules into TESLA rules.
5) Improve the readability of the figures.
It is understood that these modifications (in particular, item 1) will require freeing space from other parts of the paper. We leave it to the authors to propose how to gain the needed space.