Optimizing Semantic Reasoning on Memory-Constrained Platforms using the RETE Algorithm
Author(s): William Van Woensel, Syed Sibte Raza Abidi
Full text: submitted version
Abstract: Mobile hardware improvements have opened the door for deploying rule systems on ubiquitous, mobile platforms. By executing rule-based tasks locally, less re-mote (cloud) resources are needed, bandwidth usage is reduced, and local, time-sensitive tasks are no longer influenced by network conditions. Further, with data being increasingly published in semantic format, an opportunity arises for rule systems to leverage the embedded semantics of semantic, ontology-based data. To support this kind of ontology-based reasoning in rule systems, rule-based axiomatizations of ontology semantics can be utilized (e.g., OWL 2 RL). Nonetheless, recent benchmarks have found that any kind of ontology-based reasoning on mobile platforms still lacks scalability, at least when directly re-using existing (PC- or server-based) technologies. To create a tailored solution for resource-constrained platforms, we propose changes to RETE, the mainstay algorithm for production rule systems. In particular, we present an adapted algorithm that, by selectively pooling RETE memories, aims to better balance memory usage with performance. Further, we show that this algorithm is well-suited towards many typical Semantic Web scenarios. Using our custom algorithm, we perform an extensive evaluation of semantic reasoning both on the PC and mobile platform.
Keywords: RETE; OWL2 RL; rule-based reasoning; OWL reasoning; reasoning optimization
Review 1 (by anonymous reviewer)
(RELEVANCE TO ESWC) - (NOVELTY OF THE PROPOSED SOLUTION) - (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) - (EVALUATION OF THE STATE-OF-THE-ART) - (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) - (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) - (OVERALL SCORE) The paper investigates the usage of (variants of) the RETE algorithm to realize rule-based reasoning under constrained resources. It focuses on the OWL profile OWL 2 RL where reasoning can be realized in a rule-based fashion. The authors describe several versions of the RETE algorithm which differ in the way in which memory is assigned to the nodes. They perform a comprehensive evaluation, measuring a variety of metrics. Overall the paper is nicely written, clearly motivated, and technically sound; topic-wise it is a good fit for ESWC. The evaluation is very thorough, but I have the impression that the reader is lost amidst the many results presented. I would like to urge the authors to put some effort into better aggregating the results, using diagrams to provide an overview of the findings and single out the main lessons learned from the experiments.
Review 2 (by Ana Roxin)
(RELEVANCE TO ESWC) The paper at hand presents a new reasoning algorithm (RETEpool) along with its exemplification and evaluation. These items correspond tot he topics defined for the ESWC "Reasoning" track. (NOVELTY OF THE PROPOSED SOLUTION) The approach at hand is based on the RETE algorithm. It consists in extending this algorithm by pooling a particular selection of RETE alpha memories. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The paper is well written, the experiment conducted is well described, benchmarks are provided in order to attest the performance reached by this "new version" of the RETE algorithm. (EVALUATION OF THE STATE-OF-THE-ART) The algorithm at hand was specifically conceived to deal with the data duplication generated by OWL 2 RL generic rule premises. Thus the "Related Work" section is focused solely on algorithms dealing with OWL2 RL reasoning. Authors evaluate different versions of their algorithm, that differ in the type of memory (regular, virtual with differents thresholds, shared) they use for the beta nodes. Unfortunately, the authors do not compare the results of their algorithm with other existing state-of-the-art algorithms. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) Authors extensively describe the expriments perfomed, without "hiding" any details. They also include a discussion about the limitations of their evaluation namely using only OWL 2 RL rulesets. Future work is correctly identified based on current limitations. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The experiment conducted is well explained in terms of benchmark setup and benchmark results. Notably, the benchmark setup section extensively describes the different components used for the experiment: the baseline system, the OWL 2 RL ontologies and ruleset, the benchmark platforms, the RETEpool configurations along with the benchmarks metrics defined. As a suggestion, results displayed in Table 1 could be highlighted in terms of "best performance achieved" ? For example, in a PC context, it's th RETEbase version that achieves the best reasoning performance. Thus the result achieved (e.g. "15705") could be in bold in order to improve the readability of the overall results. (OVERALL SCORE) Summary: The paper at hand describes a novel reasoning algorithm, called RETEpool, based on the pooling of particular RETE alpha memories. The performance of the algorithm was evaluated through extensive benchmarks, and a solid related work section clearly positions this contribution in the considered research domain (OWL2 RL rule reasoning over Semantic Web resources). SPs: - The overall approach is well justified, explained and commented - The approach focuses on reducing data duplication in alpha memories - The paper is well-written, the experiments are well described WPs: - Authors do not compare the results of their algorithm with other existing state-of-the-art algorithms. QAs: - When considering other types of rulesets, do you think applying a rule selection algorithm (e.g. https://www.sciencedirect.com/science/article/pii/S0169023X15000713) might help in reducing data duplication ? - Does the type of reasoning performed impact the overall performance of the algorithm e.g. backward- against forward- or hybrid chaining reasoning approaches ?
Review 3 (by anonymous reviewer)
(RELEVANCE TO ESWC) The topic of reasoning in an OWL 2 profile (RL) is certainly relevant. (NOVELTY OF THE PROPOSED SOLUTION) The RETE algorithm is well-known; the proposed optimizations are sensible but do not look very surprising. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The proposed solution is correct. (EVALUATION OF THE STATE-OF-THE-ART) Other systems could have been included in the comparison. In particular, it is not clear why RETE even with all of these optmizations would be the most memory efficient. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The approach is discussed in great detail. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The experiments are available online. (OVERALL SCORE) The presents a RETE-based OWL 2 RL reasoning algorithm and studies some optimizations. The main motivation is reasoning on memory-constrained devices. ** Strong Points (SPs) ** 1. Reasoning on memory-constrained devices is an interesting topic. 2. The devised optimizations appear to work. ** Weak Points (WPs) ** 1. RETE is a classical algorithm, which operates in a bottom-up fashion by deriving all consequences. A comparison with datalog engines (e.g., RDFox) or goal-oriented (top-down) systems is missing. It should be noted that SNOMED-CT, for example, is not in OWL 2 RL. Gb -> GB "multiple OWL 2 profiles" - there are only three of them "does not guarantee completeness for TBox reasoning" - it is not clear why *TBox* reasoning is emphasized here *after rebuttal*: It looks like the authors confuse here the OWL 2 RL *ruleset* from the specification (which may not guarantee TBox reasoning) with the OWL 2 RL profile, which is a sublanguage of OWL 2 DL, and as such has systems that guarantee TBox reasoning. The list of the three items on p 3 is a way too technical and is quite difficult to understand unless one knows all the nuances of OWL semantics (this, in particular, applies to item 3). The phrase "each annotation property has type owl:AnnotationProperty" is a bit unexpected - how does one know that 😛 is an annotation property if it does not have type owl:AnnotationProperty? *after rebuttal*: The OWL 2 RL refers to a fixed set of *built-in* annotation properties (so, these are facts rather than an axiom). The title of Section 3 mentions reasoning on RDF - what sort of reasoning would one do on RDF? IntersectionOf does not exists in RDF. Is it not OWL 2 RL? Section 4.2 is a bit difficult to follow. In particular, the explanations on the number of joins may need more comments.
Metareview by Diego Calvanese
The paper studies the use of different variants of the RETE algorithm to realize rule-based reasoning for the OWL2RL of OWL2, which supports a rule-based inference approach. The main motivation for the paper is reasoning on memory-constrained devices, which is an important topic. There is agreement that the paper is well written, that the overall approach is well-justified, and that the evaluation carried out is thorough. The main weakness is the lack of a proper comparison with alternative state-of-the-art algorithms, such as RDFox or goal-oriented approaches.