CEMID- A Semantic Based Framework for Complex Events Modeling and Detection in Multimedia Sensor Networks
Author(s): Chinnapong Angsuchotmetee, Richard Chbeir, Yudith Cardinale, Shohei Yokoyama
Full text: submitted version
Abstract: In Multimedia Sensor Networks (MSNs), gathered readings are enriched with multimedia data which can be used for detecting more complex and application-meaningful events, compared with using solely scalar sensors. However, the main challenge is the difficulty in translating all gathered data into events. We propose CEMID, a semantic-based framework to support Complex Events ModelIng and Detection in MSNs, which relies on: (i) MSSN-Onto, an ontology-based data model for modeling MSNs; (ii) the CEMiD language for defining MSNs and events; and (iii) a semantic-based event processing engine. CEMID allows users to model MSN infrastructure and events, and then dynamically detects events in near real time according to the provided models. We validate CEMID by experimentation. Results show that it can properly detect events in a high-workload scenario with a detection latency of less than one second.
Keywords: multimedia sensor network; atomic/complex event; event processing
Review 1 (by anonymous reviewer)
(RELEVANCE TO ESWC) This paper presents an approach for ontology-based complex event processing in the context of multimedia sensor networks. The paper is definitely relevant for the semantic web community. It fits the state of the art of semantic sensor networks, semantic complex event processing and stream reasoning. (NOVELTY OF THE PROPOSED SOLUTION) The paper proposes a framework that contains the MSSN ontology, a language for defining complex events and an implementation of a dedicated engine. The novelty of the approach is one of the main issues. The MSSN Ontology is the first presented contribution of the work. However, as the authors say in Section 3, the ontology is the result of previous work. Although the paper is self-contained, since the ontology is well described in Section 4.1, I am not sure it should count as a contribution. Perhaps the CEMiD language, which is based on the ontology itself, should be discussed in more detail to present the scientific value of a declarative approach that allows describing the problem to solve. Currently, the language does not have a formal semantics. The authors rely on the MSSN ontology abstractions and some known CEP operators, but they should reference the intended semantics to characterize the correctness of their approach. Finally, what is the value of reimplementing a complex event processing engine from scratch? The authors admit that a commercial solution like ESPER can achieve better performance. Therefore, they should quantify the effort of implementing a custom solution using ESPER in order to claim that their approach improves the state of the art. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) As I mentioned before, the correctness of the approach is not guaranteed by a formal semantics. Nor is it tested in the evaluation. Therefore, it is not possible to quantify the correctness/completeness of the approach. Nevertheless, the paper presents a minimal running example.
The included listings (which are not numbered) are sufficient to show how the approach successfully solves the example. (EVALUATION OF THE STATE-OF-THE-ART) The state-of-the-art section (Section 2) focuses on Sensor Network Modeling, Multimedia Modeling and CEP. The authors should also consider the works related to Semantic Complex Event Processing, e.g.: "Semantic complex event processing for social media monitoring - a survey" (R Keskisärkkä, E Blomqvist); "Event Processing in RDF" (M Rinne, E Blomqvist, R Keskisärkkä, E Nuutila); "Ontology-driven complex event processing in heterogeneous sensor networks" (K Taylor, L Leidinge); "Semantic rule-based complex event processing" (K Teymourian, A Paschke); "Towards Ontology-Based Event Processing" (R Tommasini, P Bonte, E Della Valle). Furthermore, I think that given the nature of the problem, the authors should also consider recent works on Stream Reasoning and RDF Stream Processing. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The paper discusses the CEMiD language and the architecture well. It provides several examples and explains all the language features in detail. However, I think the explanation lacks proper references that would compensate for the absence of a formal semantics. For instance, the topological operators could be referred to GeoSPARQL OVERLAPS and SEQ could be referred to ETALIS. As a side note, Allen's Algebra is not specific to complex event processing; it is an algebra for reasoning about time that was used by logic-based CEP engines. Allen's Algebra operators are available in the ESPER EPL implementation. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) Unfortunately, the current implementation is not available, so we cannot reproduce the results. The evaluation is well structured but does not rely on any of the existing benchmarks or datasets available.
Datasets from the past DEBS challenges or existing RDF Stream Processing benchmarks like CityBench would help to generalize the results.
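The side note on Allen's Algebra in this review can be made concrete with a small example. The following is a minimal Python sketch of the thirteen Allen interval relations, assuming closed numeric intervals with start < end. The names `Interval` and `allen_relation` are hypothetical and are not taken from the paper, from ESPER, or from ETALIS; the sketch only illustrates the kind of temporal reasoning the algebra covers.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A time interval with start < end (hypothetical helper type)."""
    start: float
    end: float


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the name of the Allen relation that holds from a to b."""
    if a.start >= a.end or b.start >= b.end:
        raise ValueError("intervals must satisfy start < end")

    # Disjoint or touching intervals.
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met_by"

    # From here on, the intervals share more than a single point.
    if a.start == b.start:
        if a.end == b.end:
            return "equals"
        return "starts" if a.end < b.end else "started_by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished_by"
    if a.start < b.start:
        return "overlaps" if a.end < b.end else "contains"
    # Remaining case: a starts strictly after b starts.
    return "during" if a.end < b.end else "overlapped_by"


if __name__ == "__main__":
    print(allen_relation(Interval(0, 5), Interval(3, 8)))
    print(allen_relation(Interval(1, 2), Interval(0, 5)))
```

Each pair of proper intervals satisfies exactly one of the thirteen relations, which is why the function can return a single label.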
Review 2 (by Loris Bozzato)
(RELEVANCE TO ESWC) The work presents an application of Semantic Web technologies to the realization of an MSN framework, thus it is sufficiently relevant to the topics of the conference. (NOVELTY OF THE PROPOSED SOLUTION) The paper proposes some novel contributions; however, it does not sufficiently compare them with related approaches in order to understand the novelty of the proposals. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) No technical flaws in the proposed solution are evident from its presentation in the paper. (EVALUATION OF THE STATE-OF-THE-ART) In the discussion of the related works (Section 3), the comparison of the considered approaches w.r.t. the inspected requirements (cf. Table 2) should be expanded: it is not always evident from the discussion in which sense the approach is "partial" with respect to a given requirement. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The main problem with the presentation of the paper lies in the evaluation of the properties of the proposed approach: the experimental evaluation presented in Section 5 appears to only consider the scalability of the developed system in a test scenario. Thus, it only evaluates part of the work presented in the paper, i.e. the implementation of the framework: for example, it is not clear how the event modelling language compares with other related approaches, which would motivate its need. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The experimental study only considers the scalability of the approach w.r.t. different dimensions of the scenario. It does not provide a justification e.g. for the applicability of the approach to different scenarios or the generality of the presented representation of events. Moreover, the system and test data are not available, so the tests are not directly reproducible.
(OVERALL SCORE) SUMMARY: The paper presents an approach and a system for the modelling and detection of complex events in multimedia sensor networks. The authors introduce the CEMID framework in order to offer a comprehensive solution to the challenges posed by the modelling of MSNs, the modelling of their composite events, multimedia data interpretation and event detection in near-real-time. The CEMID framework is presented by introducing its components: MSSN-Onto, an ontology for modelling MSNs; the CEMiD language, a SPARQL-like language for defining complex events; an event processing system that uses the proposed ontology and language to detect complex events in near-real-time. For the latter component, an experimental evaluation showing its scalability is provided. STRONG POINTS: - The approach presents a comprehensive solution for complex event modelling and detection in MSN scenarios - The paper is clearly written and the running example helps in understanding the application of the approach - The experimental evaluation of the system scalability appears to be promising. WEAK POINTS: - The motivation for developing new solutions for each of the components of the framework should be better explained - The evaluation only covers the event detection system: no assessment is provided for the MSN model ontology or complex event modelling language - The discussion of the limits of the related works should be extended QUESTION TO AUTHORS: - How can you motivate the need for developing MSSN-Onto and the CEMiD language? Why are current solutions not enough and why can they not be extended to be used in your framework? - Is there any plan for evaluating MSSN-Onto and the CEMiD language, e.g. by showing that these can simplify the modelling of complex scenarios? - What are the benefits of using semantic technologies in the presented event detection problem? ----- Added after rebuttal: I acknowledge the authors' comments to reviews provided in their response letter.
However, the authors' reply does not help in clarifying the novelty of the paper contributions, the need for some of the proposed solutions (MSSN-Onto and CEMiD language) and outcomes of the experimental results. These aspects need to be addressed in the paper to justify the claimed contributions of the work: considering this, unfortunately, I have to lower my scores.
Review 3 (by anonymous reviewer)
(RELEVANCE TO ESWC) As the paper proposes a semantic framework for the problem of sense making of sensor data, the paper is surely relevant to ESWC. (NOVELTY OF THE PROPOSED SOLUTION) Unfortunately, I don't think the proposed solution is novel. Indeed, in my opinion the authors have not carefully reviewed the literature. There is plenty of relevant work in this space, both within the semantic web community and beyond. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The work seems correct. Completeness is difficult to judge here. Obviously, the work is inherently limited to the scenario the authors are proposing, that of a smart office (which strongly relates to smart spaces more generally, e.g. also smart homes). The problem of modeling, detecting and representing complex events in multimedia sensors obviously goes beyond such a scenario. It is difficult to tell how well the proposed solution scales out to other domains (say emergency or scientific research infrastructures). Furthermore, the described situations are (as usual for this kind of paper) relatively trivial: "excessive working light" and "overpowered heater". (EVALUATION OF THE STATE-OF-THE-ART) Authors need to substantially improve here. The work is not critically evaluated against the state of the art. Furthermore, I would argue that some statements are factually not correct. For instance, the authors claim that "However, we can support event processing in MSNs, while ESPER cannot". To my understanding, ESPER is designed for complex event processing and supports user-defined functions. Indeed, it should be feasible to implement a face detection algorithm similarly to how the authors argue they have implemented it in the paper. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The framework is demonstrated reasonably well, at least as far as this is possible in a paper. Surely, I have an issue with simulating the scenario and arguing that the results will transfer into reality.
The reality of sensor networks is that sensors fail, provide outliers, drift, etc. A simulated scenario typically doesn't cover such cases and is thus generally a far easier setup compared to a real sensor network. Furthermore, the authors provide no details about some of the more complex aspects, such as detecting a known person. How was this implemented? Finally, a critical discussion of the approach is missing. For instance, what is the benefit of the CEMiD language? It seems that it can be implemented using plain SPARQL. I understand CEMiD is a high level language and abstracts from SPARQL Update queries, but authors could discuss the advantages of doing so. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) As far as I can tell, this work is not reproducible. The experiments are simulated and I have my doubts that it can be generalized much beyond the proposed scenario and the described situations. (OVERALL SCORE) The authors tackle the issue of formulating tasks for detecting complex events in multimedia sensor networks. The problem is interesting and there is a large and interesting literature. The authors propose and evaluate (for a simulated scenario of a smart office) a corresponding framework. As strong points, the framework provides a meta-language to formulate events. This is interesting, although the advantages and disadvantages are not well articulated. Overall, the paper is written clearly (but further proofreading would help, e.g. misspelled SQL/SPARL or "people are authorized stuff or not" [staff]) and can be approached by a large audience. The tackled problem is interesting and remains largely unresolved. As weak points, the framework was tested only for a simulated scenario. I encourage authors to evaluate the framework on real world scenarios with real sensors. It is by far more complicated.
The literature review is weak and I would argue that there are studies that address the listed requirements (whether we need a meta-language to model MSNs is not clear). It is unclear what the novelty is. Surely, the SSN ontology is not limited to scalar values. The Result of an Observation may very well be a satellite data product. If nothing else, such a product can be identified by a URI which could be the Result of an Observation in SSN. IMO, scalar data is one kind of multimedia sensor. It is thus confusing to argue that "... where data gathered from multimedia and scalar sensors ..." (Section 2). Is the MSSN-Onto published? Is the framework published? Why not use GeoSPARQL for floor plan and coverage area modeling? GeoSPARQL provides for the modeling of features and geometries. What motivates the different concept for Sensor and MediaSensor? Isn't MediaSensor a Sensor? Section 4.2: "Sensor types that are created by this command are treated as ABox instances which will be later used for creating TBox instances through INSERT SENSOR command". I think this sentence gets ABox and TBox confused. To my understanding it should be the opposite. Sensor types are created as TBox axioms while these are used to create ABox assertions through inserting concrete sensor instances. How is Fig 5 (a) explained? I would expect that at some point the resources of a single machine are depleted. Rather than showing a flat line, it would be more interesting to evaluate the system so as to push its limits, and show that the detection latency at some point does increase. Please provide the (sampling) frequency in Hz, not ms. Indeed, Fig 5 (b) is very confusing and I suspect it is wrong. Sampling frequency of 1000 ms translates to 1 Hz while 200 ms to 0.2 Hz. I would expect the line to go opposite, higher with higher frequency? The URL in footnote 9 misses the protocol, http. === FEEDBACK TO REVISION === Thank you for addressing reviewer comments.
I am still confused about the authors' insistence that there are no studies on (semantic) CEP for "multimedia sensor networks". IMO, this is not correct. Just digging out one example from an author whose work I have read in the past shows that there is relevant related work in this space. Yes, the authors might not call it "multimedia sensor networks" but most environmental sensor networks are in fact "multimedia" and handle different kinds of data. There is also relevant work from the streamed RDF/SPARQL community (e.g. what Jean-Paul Calbimonte et al. did in the Swiss Experiment). Also, on the point that "We propose MSSN-Onto because SSN ontology of W3C lacks of important vocabularies for modeling multimedia sensor networks": How about extending the SSN ontology with the required vocabulary? IMO it makes more sense than proposing yet another ontology for the sensing domain. Furthermore, I think classical CEP (e.g. ESPER) does more than the authors admit here. Yes, modelling sensor networks is beyond the scope but it very much supports implementing complex event detection on heterogeneous data. Finally, and perhaps this is just me, frequency in ms rather than Hz confuses me. As far as I can tell, the authors did not address this point. https://dl.acm.org/citation.cfm?id=2254161 https://doi.org/10.1145/1416729.1416732 https://en.wikipedia.org/wiki/Frequency#Units
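The unit question raised in this review comes down to the relation f = 1/T between a sampling period T and a sampling frequency f. The following is a minimal Python sketch of that conversion, assuming the millisecond values reported in the paper are sampling periods (the time between two consecutive readings); the function name `period_ms_to_hz` is hypothetical. Under that assumption, a period of 1000 ms corresponds to 1 Hz and a period of 200 ms corresponds to 5 Hz.

```python
def period_ms_to_hz(period_ms: float) -> float:
    """Convert a sampling period in milliseconds to a frequency in hertz."""
    if period_ms <= 0:
        raise ValueError("period must be positive")
    # 1 Hz means one sample per second, i.e. one sample per 1000 ms.
    return 1000.0 / period_ms


if __name__ == "__main__":
    for period_ms in (1000, 200):
        print(f"{period_ms} ms -> {period_ms_to_hz(period_ms):g} Hz")
```

Shorter periods therefore mean higher frequencies, so an axis labelled in milliseconds runs in the opposite direction to one labelled in Hz.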
Review 4 (by Edward Curry)
(RELEVANCE TO ESWC) Very relevant as the focus is semantics. (NOVELTY OF THE PROPOSED SOLUTION) Semantification of existing approaches. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Looks correct. Has an experimental validation. (EVALUATION OF THE STATE-OF-THE-ART) Covers the main related works. However, there is room to improve the analysis. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) An experiment has been run with some discussion. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The experiment is well described. Some parameters are missing. Difficult to reproduce without code being released. (OVERALL SCORE) The paper tackles a very relevant problem and provides a framework for handling multimedia data challenges using complex event modelling in multimedia sensor networks (MSN). The main contributions of the paper include (i) MSSN-Onto: Ontology-based data model (ii) CEMiD: Language for defining events in MSNs (iii) Complex Event Processing Engine to support multimedia data. The evaluation demonstrates the efficiency of the proposed system with an increased working load. The paper reads well. Comments for improvement =State of the Art= - How were the requirements defined? - It is not clear why the ontology-based multimedia data modelling techniques cannot be used for modelling MSN infrastructure and events (Section-3 (ii))? - Table-2 Analysis: No discussion of “Partial” in Sensor Network Modelling and also in Multimedia Data Modelling. Need more clarification. Would they be sufficient for the approach? If not, why not? - Confusing heading “Complex Event Modeling”: either it should be “Semantic-based Event Modelling”, in which case should its entry not be “Yes”, or “Partial” should be replaced with “Scalar data only”.
=Evaluation= - It is very interesting to know the performance of the proposed system by varying the number of sensors, sampling frequency and the number of operators, and I would also be interested in knowing why and how they are important to justify the performance. What would be the expected performance in a real-world scenario? - What are the four operators? Why not more or fewer? - The proposed system is compared with commercial event detection engines like “ESPER” without specifying configurations. Further details required. - Figures are very difficult to read (size). Strong Points: - Strong Motivation: The paper provides a good motivation for the analysis of multimedia alongside scalar data, using a single example throughout the paper. - Framework: Novel approach for abstracting the user from reimplementation by dynamically modifying the MSN infrastructure, applications and events according to the application domain. - Readability: High-quality explanation of the proposed architecture with clear dimensions of contributions relying on (i) MSSN-Ontology (ii) CEMiD Language and (iii) Complex Event Processing Engine. Weak Points: - Identification of Gap: Missing discussion on limitations of existing systems: why are they not handling these challenges? In other words, a summary of the limitations of existing approaches is needed. - Requirements and state of the art analysis could be improved - Evaluation: Discussion and implications of the results Questions to the Authors: - The proposed approach is an enhancement of a complex event-processing engine with the provision of multimedia modelling using the novel CEMiD query language. I wonder why it was developed from scratch? (Justification for not extending an existing event processing engine is necessary: what limitations prevent extending existing event processing engines?) - What is the motivation for using “Floor Plan/Coverage Area Modelling” in Section 4.1? Why is this a contribution?
Was SSN not designed specifically to allow such domain extensions? Response to Authors: Thank you for the response. The justification for why a new CEP engine is needed is not clear to me. There needs to be a clearer analysis of existing CEP systems and why they do not provide the necessary capability. Since the ontology has been recently accepted in FGCS (congratulations), I am not sure what contribution this paper now makes. For these reasons, I have changed my decision to a reject. Good luck with your future work.
Metareview by Maribel Acosta
This work addresses the problem of Complex Event Processing (CEP) in the context of Multimedia Sensor Networks (MSN). To tackle this problem, the authors propose CEMID, a semantic framework composed of: the MSSN-Onto ontology, the CEMiD language, and an event processing engine. The empirical study presented in the paper compares the performance of two operators of the proposed solution. The reviewers agreed that the problem tackled in this work is relevant to the Semantic Web. Nonetheless, after an extensive discussion, the reviewers have identified several major issues with the foundations of this work. Mainly, the components of the proposed solution disregard the results reported in the literature: the authors should provide either theoretical or empirical evidence that current solutions are not sufficient for addressing the task at hand. Unfortunately, without a proper comparison with related work, the novelty and research contributions of this work remain unclear. Lastly, the reviewers have provided detailed suggestions to address the issues highlighted during the evaluation process. We highly encourage the authors to consider these suggestions to improve the overall quality of this work.