Dynamic Tailoring of RETE Networks in Incremental Scenarios
Author(s): William Van Woensel, Syed Sibte Raza Abidi
Full text: submitted version
Abstract: Decision support systems, with production rule systems at their core, have an opportunity to leverage the embedded semantics of semantic, ontology-based data to improve decision support accuracy. Advances in mobile hardware are enabling these rule-based systems to be deployed on mobile, ubiquitous platforms. By deploying reasoning processes locally, such systems keep time-sensitive tasks independent of network conditions, waste less bandwidth, and require fewer (costly) remote resources. Despite hardware advances, however, recent benchmarks found that, when directly re-using existing (PC- or server-based) technologies, the scalability of reasoning on mobile platforms is greatly limited. To realize efficient semantic reasoning on resource-constrained platforms, utilizing rule-based axiomatizations of ontology semantics (e.g., OWL 2 RL), which are known to trade expressivity for scalability, is a useful first step. Furthermore, the highly dynamic nature of mobile and ubiquitous settings, where data is typically encountered on-the-fly, requires special consideration. We propose a tailored version of the RETE algorithm, the mainstay algorithm for production rule systems. This algorithm dynamically adapts RETE networks based on the evolving relevance of rules, with the goal of reducing their resource consumption. We evaluate semantic reasoning using our custom algorithm and an OWL 2 RL ruleset, on both PC and mobile platforms.
Keywords: RETE; OWL 2 RL; rule-based reasoning; OWL reasoning; reasoning optimization
Review 1 (by anonymous reviewer)
(RELEVANCE TO ESWC) The article depicts optimizations of the Rete algorithm for forward inference in rule-based systems, and the approach is empirically evaluated over the OWL 2 RL Benchmark Corpus. Both seem relevant to ESWC. (NOVELTY OF THE PROPOSED SOLUTION) According to the authors themselves, the optimizations are similar to [Doorenbos 95] (and to some extent to [Miranker 87]), and a detailed comparison is missing to appreciate the novelty of the approach. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Completeness is claimed to be preserved, but no justification is given for the case of fact retraction (see the full review below). (EVALUATION OF THE STATE-OF-THE-ART) Pointers to the literature are relevant, but the analysis of the state-of-the-art needs to be more detailed. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The optimizations are very clearly described and illustrated. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) Thanks to a precise description of the algorithm, and to the publication of the ruleset and data, the evaluation should be reproducible. The main weakness of the evaluation is the chosen baselines (see the full review below). (OVERALL SCORE) The article proposes a series of optimizations of the Rete algorithm, one of the most popular algorithms for forward inference in rule-based systems. The optimizations are meant to target resource-constrained platforms (such as mobile phones), and incremental reasoning, i.e. scenarios where additional facts (or fact retractions) may be dynamically fed to the inference engine. The optimizations prevent attempting some joins which will necessarily fail, because the memory of one of the join arguments is empty. This is performed by "unlinking" alpha nodes from (the Rete graph of) certain rules, or further "pausing" them if they have been unlinked from all rules in which they appear.
A triggering mechanism allows relinking/reactivating alpha nodes once the join operations have a chance to succeed again. As a side effect, if a paused alpha node never resumes execution, unnecessary matches will be avoided. A detailed description of the whole unlinking/pausing/resuming mechanism is provided, including an algorithm, called Rete_tailor. The strategy is then empirically evaluated over the OWL 2 RL Benchmark Corpus, together with a partially handcrafted ruleset, available online, in addition to helpful links and samples. The two baselines are a naive implementation of the Rete algorithm, and a simple a priori tailoring strategy, which consists in running the naive Rete over a fragment of the dataset, discarding unused rules, and then performing (incomplete) inference over the remainder of the dataset. The article is particularly well written, clear, and largely self-contained. The more technical aspects are introduced progressively, and illustrated with examples. A precise description of the algorithm, together with the publication of the dataset, makes the experiment reproducible. Possible limitations (such as the need for storing tokens in memory for paused alpha nodes) are explicitly pointed out, as well as similarities to optimizations proposed in the literature. My main reservations are: 1) An insufficient positioning/analysis of the contribution. Rete-based forward inference is a relatively mature technology. A number of improvements of the naive algorithm have already been proposed/evaluated, and cost-models have even been designed (see for instance Wright et al. 2003 below). In this context, a finer-grained (theoretical and/or empirical) comparison of Rete_tailor with existing optimizations is expected. In particular, the parallels drawn in Section 5 are too high-level to appreciate the novelty of the approach.
It is argued to be well-suited for resource-bounded scenarios and incremental reasoning, but the article does not explain why previous optimizations of the Rete algorithm may not. 2) The baselines chosen for the empirical evaluation are arguably not appropriate. Comparing Rete_tailor to the naive Rete algorithm is not very informative, as the latter has long been replaced in practice by optimized versions. As for the other baseline, it does not guarantee completeness. So the performance comparison (number of join/match actions) is not only unfair to Rete_tailor, but also somewhat irrelevant, as both systems may not yield the same output. Instead, one would expect a comparison with some of the optimizations proposed for instance in [Doorenbos 95], with which Rete_tailor has a lot in common, according to the authors. Other concerns are the following: 3) Rete-based fact retraction is mentioned several times, but never (even briefly) described. Retracting a fact is arguably a key aspect of incremental reasoning, which is the application scenario targeted by the authors. To my (limited) knowledge, Rete-based retraction consists in applying the inference mechanism in a reverted fashion. But it is unclear how the proposed optimizations behave in scenarios where facts are both added and retracted. In particular, can completeness be affected by the interaction between asserting/retracting a fact and pausing/resuming alpha nodes? The answer may be obvious, but a short justification would help. 4) Although OWL 2 RL can express disjointness, the assumption seems to be made that the input dataset (data + ruleset) is consistent. Maybe this could be explicitly said. ### Questions ### - Page 5, Section 3.2, definition of h.1: "and \alpha memories in case i <= 2; Fig. 1". I could not find an "i" in Figure 1. Is it the index of the alpha node? - Section 5: How does Rete_tailor compare to [Özacar et al. 2007]? 
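For readers unfamiliar with the mechanism under discussion, the unlinking/pausing/relinking behaviour summarized above could be sketched roughly as follows. This is a minimal, hypothetical Python sketch of my own; the class and method names (`AlphaNode`, `unlink`, `relink`, `activate`) are illustrative and not taken from the paper, and the paper's resume heuristics (e.g. the external token store for paused nodes) are omitted.

```python
def matches(pattern, fact):
    """A pattern is a triple in which None acts as a wildcard."""
    return all(p is None or p == f for p, f in zip(pattern, fact))

class AlphaNode:
    """Alpha node with a memory of matched facts, feeding one or more rules."""

    def __init__(self, pattern, rules):
        self.pattern = pattern
        self.rules = set(rules)   # rules (beta networks) this node feeds
        self.unlinked = set()     # rules this node is currently unlinked from
        self.memory = []          # facts matched so far
        self.paused = False

    def unlink(self, rule):
        """Unlink from `rule`, e.g. because a join partner's memory is empty;
        pause once the node is unlinked from every rule it appears in."""
        self.unlinked.add(rule)
        if self.unlinked == self.rules:
            self.paused = True    # incoming tokens are no longer matched

    def relink(self, rule):
        """Relink to `rule` once its joins have a chance to succeed again."""
        self.unlinked.discard(rule)
        self.paused = False       # node must match incoming tokens again

    def activate(self, fact):
        """Match an incoming fact; returns the rules to propagate to."""
        if self.paused:
            return set()          # paused: unnecessary matches are avoided
        if not matches(self.pattern, fact):
            return set()
        self.memory.append(fact)
        return self.rules - self.unlinked
```

Under this reading, pausing saves the per-fact match work itself, while unlinking only saves the downstream join attempts; this is the distinction the review's point 1) asks to see evaluated separately.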
### Suggestions ### - Page 7, Section 4.1.4 7, T.iii: "to avoid redundant join and match operation": "redundant" is possibly misleading. Intuitively, redundancy means matching twice the same token (at different times), or joining twice over the same pair of tokens. I suggest something more explicit, like "avoid necessarily failing join/match attempts". - Page 7, Section 3.4: "Code 1" -> "Algorithm 1" - Page 8, second paragraph: "previously deemed redundant". Same remark as above for "redundant". - Section 5 helps the reader understand the intended application context, as well as the specificity of the approach, in particular to what extent the proposed work differs from [Doorenbos 95]. Therefore it may appear earlier in the article. - Bibliography: at least one of the articles of Forgy (author of the original Rete algorithm) should be mentioned (e.g., the one referenced below). ### Typos ### - Figure 1: t3 is used instead of t2 - Page 4, 2nd paragraph: "Afterwards, the tailored". Part of the sentence is missing. - Page 7, 3rd paragraph: "is longer empty" -> "is no longer empty" - Page 7, Section 4.1.4 7: "took place the original dataset". Missing preposition. - Page 7, case "t_x > t_y": "to beta node \beta_1(network1 > network2)". Seems like what is meant here is "(network1 > network3)", as network2 is never reached in the case t_x > t_y. - Page 12: avoid splitting Table 2 over 2 pages. ### References ### Forgy, Charles L. "Rete: A fast algorithm for the many pattern/many object pattern match problem." Readings in Artificial Intelligence and Databases. 1988. 547-559. Wright, Ian, and James A. R. Marshall. "The execution kernel of RC++: RETE*, a faster RETE with TREAT as a special case." Int. J. Intell. Games & Simulation 2.1 (2003): 36-48. --- After rebuttal Thanks to the authors for their response letter. Please find below my comments about it. 1) "one requires a comparison to the baseline system": fair enough. A comparison to the standard Rete seems indeed relevant.
My point was that it is not sufficient. 2) "An explicit comparison with the optimizations from Doorenbos seems unsuitable, since most of these are included in our work": I have to disagree with this statement. I was precisely expecting a comparison between the optimizations from [Doorenbos 95] (at least the ones implemented by the authors) and the Rete_tailor algorithm. Currently, the experiments do not specifically evaluate the new optimizations proposed in the article (namely pausing, and the lower-failed alpha node heuristic). 3) I also still disagree with the relevance of the other baseline (a priori tailoring). The evaluation compares performances (token matches, join attempts, reasoning time) of the different algorithms. Therefore they should produce the same output. To take an analogy, imagine comparing query evaluation times of two SQL engines. This is arguably pointless if one of them returns incorrect or incomplete answers. Put another way, one can only expect a qualitative comparison of Rete_tailor and a priori tailoring (number of inferred triples, ...). 4) "other approaches with the same goal" and "the most relevant work in the field": I may be missing something obvious here (apologies if I am). Which "field" exactly are we talking about? If I understand correctly, the field is "Rete-based forward chaining in incremental scenarios", as opposed to "Rete-based forward chaining in non-incremental scenarios". And according to the authors, the most relevant work in the former field is Tai et al. But what exactly makes the incremental case different? Section 3.3 paragraph 2 is not sufficient for me to understand it. More exactly: - Let us first assume that no fact is retracted. Which optimizations are better-suited to the incremental case, and why (is this due to additional parallelization opportunities in the non-incremental case)? - Now let us assume that facts can be retracted. Are there additional differences between incremental and non-incremental?
And if so, why is the literature about deletion (in the non-incremental case) not relevant?
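To make the retraction questions above concrete for readers: fact retraction in forward-chaining systems is commonly handled in a delete-and-rederive style, where facts whose support is gone must also be withdrawn. Below is a naive, hypothetical Python sketch of my own (illustration only, not the paper's mechanism; real DRed-style systems avoid recomputing the fixpoint from scratch, but produce the same output):

```python
def fixpoint(facts, derive):
    """Forward-chain `derive` over `facts` until no new facts appear."""
    inferred = set(facts)
    while True:
        new = derive(inferred) - inferred
        if not new:
            return inferred
        inferred |= new

def retract(base_facts, retracted, derive):
    """Naive delete-and-rederive: drop the retracted base facts and
    recompute the inference fixpoint over what remains, so that any
    derived fact losing its support disappears as well."""
    return fixpoint(set(base_facts) - set(retracted), derive)

def subclass_transitivity(facts):
    """Example rule: ('sub', x, y) and ('sub', y, z) imply ('sub', x, z),
    mimicking the subClassOf transitivity rule of OWL 2 RL."""
    return {("sub", x, z)
            for (_, x, y1) in facts
            for (_, y2, z) in facts
            if y1 == y2}
```

For example, retracting ('sub', 'B', 'C') from {('sub', 'A', 'B'), ('sub', 'B', 'C')} also removes the previously derived ('sub', 'A', 'C'). The reviewer's question is precisely how such withdrawals interact with paused alpha nodes, whose memories may be stale when a retraction arrives.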
Review 2 (by Wouter Beek)
(RELEVANCE TO ESWC) While reasoning is an important use case on the Semantic Web, I am not sure whether the adaptation of reasoners for resource-constrained environments is specifically relevant. The desktop and mobile systems that are used in the evaluation are already almost comparable in hardware specs. There may only be a very brief window (two to five years) in which mobile and desktop devices have access to different computational resources. It certainly would have helped to have at least some discussion on the mid- to long-term utility of this work in the paper. I can imagine some additional long-term applications in an IoT context, but the paper does not mention these at all. (NOVELTY OF THE PROPOSED SOLUTION) The proposed approach does include non-trivial extensions of an existing approach and of existing algorithms. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The approach is explicitly preserving completeness of entailment results (as opposed to a specific existing approach), and completeness is also evaluated w.r.t. multiple ontologies. (EVALUATION OF THE STATE-OF-THE-ART) Does Table 1 show the total number of unlinking/relinking operations over all ontologies? Is this not a very small number / does this even matter when compared to the total number of operations? In the text the general applicability is explained to be larger than these numbers suggest: “it is likely that [real-world scenarios] will require many more reverting operations.” But is this really the case? Since real-world data seems readily available on the Semantic Web, it should be possible to determine the validity of this supposition. I am not 100% sure whether I understand the outcomes reported in Table 2. The main outcome seems to be that the here presented approach has performance comparable to the existing approach of Tai et al., while preserving completeness.
However, the improvement in performance of both approaches compared to the default case seems to be small/negligible. It is a bit unfortunate that Table 2, which is already somewhat difficult to read/interpret, is split across two pages. Since the resume heuristic requires an external store, I was expecting the size of this store to be quantified as part of the evaluation. After all, is the performance benefit in the mobile scenario not (partially) achieved by offloading the mobile process' storage onto another process / the triple store? If the latter is the case (that is at least how I understood the setup), then it would be useful to also know the memory and CPU consumption of the triple store. Since the difference between mobile and desktop environments is crucial to the main research question of this paper, it is unfortunate that "average performance times for PC and mobile are not directly comparable". (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The comparison to other approaches and implementations in the Related Work section is very good. However, I am missing a more high-level discussion/reflection on the utility of the here presented research. I am prepared to accept use cases for mobile reasoners, but to what extent do they really have restricted access to computing resources (when compared to desktop systems)? (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) I did not find a link to the published Open Source code of the here presented implementation, but it is probably not difficult for the authors to add such a link. Since the data and rulesets are shared online, through a link that is given as a literature reference (which is a bit unusual), availability of the code would enable reproducing the here presented evaluation results. *After the rebuttal*: Since no link to an Open Source version of the code was shared, the reproducibility of the here presented work is more difficult than originally assessed.
(OVERALL SCORE) This paper extends the existing RETE algorithm with a heuristic-driven approach that results in less memory and/or CPU consumption, while preserving the completeness of reasoning over an important OWL fragment. The reduction in hardware resources specifically applies to resource-constrained environments like mobile phones, where a process has limited access to RAM. Strong points: - The invention of two intelligent extensions that could positively impact performance of the RETE algorithm without sacrificing its completeness in monotonically increasing databases. - A good Related Work section, where alternative approaches are explained. - The paper is written very well and it is nice to read. I could find almost no mistakes (not even minor ones). Weak points: - The utility of the paper currently relies on a property of the hardware market (desktop versus mobile) that may no longer exist within a couple of years. (The utility is at least not substantiated, e.g., by referring to discussions in the hardware and/or mobile research communities.) - The evaluation does not deliver the numbers that seem to be most interesting: a direct comparison between desktop and mobile. - The evaluation results -- assuming I am interpreting them correctly -- seem to indicate that the proposed approach only marginally affects performance. Yet the text does not draw this conclusion, but sometimes refers to different (real-world) datasets for which the proposed approach would have / could have had different results. ---- Small grammar errors: - Ungrammatical sentence: “Afterwards, the tailored ruleset for future assertions.” - p6. “This [is] realized” - p10. ungrammatical phrase: “A reasoning cycle took place the initial dataset” - p11. “seem[s] negligible” --- *After rebuttal* Thanks to the authors for their rebuttal. Unfortunately, my worries about the main purpose of this paper have not been cleared.
The authors say that “Although desktop and mobile systems are comparable in hardware specs, there are huge differences in performance.” The paper gives detailed information about the former (in Section 4.1.3), but never quantifies the -- apparently more important -- differences in performance. Based on the details in Section 4.1.3, it is not clear where the huge differences in performance originate from. They must be due to properties not documented in the paper (e.g., more aggressive power conservation strategies on mobile). If these properties are the most important differentiator and motivator for this research, then they should be documented (in addition to or instead of the less important hardware specs). Also: “Regardless of improvements in technology, the processing and memory capability of the latter will be smaller than those of the former.” But this is already not the case according to the information the paper gives us: a dual-core 2.9 GHz processor (desktop) does not have a larger processing capability than a quad-core 2.2 GHz processor (mobile), unless properties not documented in this paper are in place (see above).
Review 3 (by Brian Davis)
(RELEVANCE TO ESWC) The paper is extremely relevant to the Reasoning track under: optimisation, approximation, or evaluations of web reasoning algorithms, i.e., of procedures that take, as an input, ontologies (usually in RDF, RDFS, OWL, RIF) and test entailments or answer queries. (NOVELTY OF THE PROPOSED SOLUTION) The paper presents an advancement of the RETE pattern matching algorithm for implementing production rule systems, which dynamically adapts RETE networks at run time and aims to reduce consumption of computational resources. The dynamic tailoring involves unlinking alpha memories from the network as well as pausing alpha nodes such that they will no longer be matched by incoming tokens. The RETE tailor algorithm seeks to guarantee completeness in incremental scenarios by relinking and resuming at runtime when required. The delta appears to be that they extend null left activation with a second heuristic (lower-failed alpha nodes), as well as the addition of a second operation, pausing, which avoids redundant match operations. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The algorithm is very well described and is clearly differentiated against the state of the art. The experiment setup is well described, including ontologies, rulesets used and benchmark configurations and metrics. (EVALUATION OF THE STATE-OF-THE-ART) The authors describe the state of the art quite thoroughly and assert their contribution relative to the related work. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The RETE Tailor algorithm is very well described and is clearly differentiated against the state of the art. The experiment setup is well described, including ontologies, rulesets used and benchmark configurations and metrics.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The experiments appear to be well documented, referencing the algorithms used in the benchmark configuration and the datasets used, in addition to a link to online documentation; see https://niche.cs.dal.ca/materials/rete-tailor/ (OVERALL SCORE) The paper presents an advancement of the RETE pattern matching algorithm for implementing production rule systems, which dynamically adapts RETE networks at run time and aims to reduce consumption of computational resources. The dynamic tailoring involves unlinking alpha memories from the network as well as pausing alpha nodes such that they will no longer be matched by incoming tokens. The RETE tailor algorithm seeks to guarantee completeness in incremental scenarios by relinking and resuming at runtime when required. The delta appears to be that they extend null left activation with a second heuristic (lower-failed alpha nodes), as well as the addition of a second operation, pausing, which avoids redundant match operations. Strong Points: - Mostly very well written. - Algorithm and experiments well described. - Good knowledge of related work. Weak Points: - Section 3 is somewhat heavy to read; it could have been structured and broken down more clearly and succinctly, as opposed to what appears to be a lengthy discussion in parts. - A minor issue is the frequent use of "Further" in Section 4.2; I don't think that is an adverb but rather an adjective.
Metareview by Jeff Z. Pan
The paper addresses the topic of resource-aware incremental reasoning by dynamically adapting the RETE networks at runtime to reduce resource consumption. The reviewers agree that this line of research is interesting and is related to the Semantic Web. However, they also identified important weaknesses which led to a reduced value of the current version of this work: (W1) An insufficient positioning/analysis of the contribution against related works, including the Delete and Rederive approaches, and incremental reasoning approaches for OWL RL and OWL EL; (W2) Section 3 is somewhat heavy to read; (W3) The utility of the paper currently relies on a property of the hardware market (desktop versus mobile) that may no longer exist within a couple of years: some more thorough motivations (supported by evidence) should be provided. The paper was suggested to be a candidate conditional-accept paper. Unfortunately, it didn't manage to stay above the threshold in the final round of decision making. We hope the comments here are useful for a future submission of the work.