Paper 5 (Research track)

Price Sharing for Streaming Data: A Novel Approach for Funding RDF Stream Processing

Author(s): Tobias Grubenmann, Daniele Dell’Aglio, Abraham Bernstein, Dmitry Moor, Sven Seuken

Full text: preprint

Abstract: RDF Stream Processing (RSP) has proposed solutions to continuously query streams of RDF data. As a result, it is today possible to create complex networks of RSP engines to process streaming data in a distributed and continuous fashion. Indeed, some approaches even allow distributing the computation across the web. But both producing high-quality data and providing compute power to process it cost money.

The usual approach to financing data on the Web of Data today is that either some sponsor subsidizes it or the consumers are charged. In the stream setting consumers could exploit synergies and, theoretically, share the access and processing fees, should their needs overlap. But what should be the monetary contribution of each consumer when they have varying valuations of the differing outcomes?

In this article, we propose a model for price sharing in the RDF Stream Processing setting. Based on the consumers’ outcome valuations and the pricing of the raw data streams, our algorithm computes utility-maximizing prices that the different consumers should contribute, whilst ensuring that no participant has an incentive to manipulate the system by providing misinformation about their value, budget, or requested data stream. We show that our algorithm is able to calculate such prices in a reasonable amount of time for up to one thousand simultaneous queries.

Keywords: RDF Stream Processing; Price Sharing; Equal-Need Sharing

Decision: reject

Review 1 (by anonymous reviewer)

(RELEVANCE TO ESWC) The paper discusses a problem related to pricing in the context of rdf stream processing systems.
(NOVELTY OF THE PROPOSED SOLUTION) Even though the proposed solution is novel, it is rather straightforward, with no significant technical challenges.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The solution is correct, but could be extended to cover situations where condition 1 (section 4.4) is not satisfied.
(EVALUATION OF THE STATE-OF-THE-ART) The discussion is brief, but adequate. There is no evaluation of approaches other than the proposed one.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The discussion is easy to follow. However, some aspects of the approach (e.g., when condition 1 is not satisfied) are left for future work.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The evaluation of the proposed approach is not complete, and does not help to deeply understand the behavior of the proposed algorithm. It is not clear how the time limit parameter for each one of the queries is set. It is also not clear how varying the value, budget and time limit parameters will affect the performance of the proposed approach. Even though experiments with synthetic datasets are useful, it would be nice to report results with real datasets (at least one), as well.
(OVERALL SCORE) This paper deals with the problem of price sharing for streaming rdf systems.
This is an interesting problem, but the paper does not make significant technical contributions. Presenting a solution that would also work for cases where the consumer gets value for partial streams (condition 1) would make the paper much stronger, since these cases appear often in practice. 
SP1. Interesting problem.
SP2. Simple and elegant solution.
SP3. Easy to follow text.
WP1. Solution does not cover the important case of partial streams.
WP2. Non-conclusive experimental evaluation.
Update: My most important point on not covering the case of partial streams has not been addressed, and this means that the contribution of the paper is limited.

Review 2 (by anonymous reviewer)

(RELEVANCE TO ESWC) The work presents a price sharing model for streaming data. Though RDF stream processing is mentioned in the title of the paper, there is nothing specific to RDF or semantic data. Thus, the paper is only weakly relevant to ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) Though price sharing is not a novel idea in economics, the application of this approach to stream processing in a cloud environment is an interesting idea. Though some assumptions of this paper are somewhat questionable, the approach is basically promising.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) An algorithm for cost sharing (or better, allocation) is presented. However, this problem seems rather to be a constrained optimization problem (total budget). Furthermore, the algorithm isn't well presented: e.g., S would better be the set of queries rather than of indexes, and what happens to queries if the budget is exhausted?
(EVALUATION OF THE STATE-OF-THE-ART) The authors briefly discuss some cost sharing approaches. This should be extended to also consider cloud pricing models. In contrast, the discussion of RDF stream processing is not really necessary for this paper.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The evaluation described in the paper is just a run of the price sharing algorithm without any 
connection to a real system or deployment. However, details of the synthetic data (apart from 
"randomly generated") are not given, which limits reproducibility.
(OVERALL SCORE) The paper presents a model and algorithm for price sharing in (cloud-based) stream processing.
The basic assumptions are that users have to pay for running operators and accessing stream sources
for a certain time. The main idea is that users specify their expected value (utility) and
the budget they are willing to spend. Based on this information, a cost distribution is calculated.
In principle, this represents an interesting and promising idea.
Strong points:
S1: Cost/price sharing is an interesting idea for cloud-based stream processing and the paper is
among the first to try to address this problem with an economic approach.
S2: The price model and the requirements are well described and motivated.
Weak points:
W1: The assumptions underlying this approach are questionable. This holds both for the 
requirements (partially R2, R3) and the model (see details below and in the correctness section).
W2: Actually, the approach is based on a method described in [15]. Thus, the main contribution of
this work is to apply this model to a stream processing scenario. Furthermore, there is nothing
special related to stream processing (apart from longer running queries) or even RDF stream processing.
W3: The evaluation is limited to a runtime analysis of the algorithm based on synthetic data. 
Neither the hypothesis nor the goal of these experiments is really clear. I would argue that calculation
time is only a minor issue in this approach.
detailed comments:
* The price model doesn't reflect typical price models for cloud infrastructures: wouldn't it be more
realistic to pay for resource usage and not for operators? Second, in stream processing the arrival
rate of tuples is important which is not considered here. There is a big difference in running a query
where only 1 tuple per minute arrives vs. a query with thousands of tuples per millisecond.
* There are some concerns with the requirements (R3, R2) and the corresponding properties (sect. 4.4):
though the motivation for R2 and R3 is clear, the explanations (what "misinformation" means, getting
higher utility by outside computation) are unclear. Why is the algorithm "ignorant" of budget and value?
These parameters are used in Alg. 1?! For R3: what is wrong with a query where the platform is used
to prepare some data and perform the compute-intensive learning step outside? How could this be forbidden
by licenses?
In summary, the work seems to be in a rather premature state. The authors should revise their assumptions.
Furthermore, the paper is probably better suited for a cloud conference than a semantic web venue.
After rebuttal:
I thank the authors for their response, but my concerns still hold. Therefore, I do not wish to change the scores in my review.

Review 3 (by Pieter Colpaert)

(RELEVANCE TO ESWC) The Semantic Web puts forward tools for decentralizing data tasks, yet this decentralization comes at a cost for which the business model is unclear. This paper puts forward a price sharing algorithm for funding RDF stream processing. It might be one of the highlights of ESWC2018.
(NOVELTY OF THE PROPOSED SOLUTION) Exciting work putting forward the first step into research for new data-driven business models. Exactly what the community needed.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Q1: Is it correct that information about how this algorithm was implemented, and on what machines the evaluation was executed, is missing?
This evaluation, which I deem less important than the intriguing ideas behind the paper, seems to lack implementation and design details.
(EVALUATION OF THE STATE-OF-THE-ART) The Linked Data Fragments framework, explained in [22], was also created from the observation that publishers would be paying too much for hosting a public Web API with too many functionalities, while a much simpler and more cost-efficient interface could be thought of, still offering user agents on the Web good access to the data. Furthermore, it would allow other intermediary agents to perform some actions for other parties. This paper could in fact automate this negotiation of costs.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The paper is well written and has nice examples to illustrate the proposed approach.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) Q2: It’s clear that stream processing is a special resident in the Web of Data, as explained in your Introduction, yet can your approach also be applied to classic webby datasets?
The research datasets are not available. The algorithm is well described in a code snippet, yet the code itself is not available either.
Q3: Could this data become available?
(OVERALL SCORE) Based on three economic requirements, an algorithm to allocate runtimes and calculate the total prices is put forward. The requirements are proven to be satisfied. In chapter 5, Evaluation, the processing time needed to run the algorithm is evaluated.
Strong points:
* Introducing economics in Semantic Web
* First step into overlooked issue of business model for Web of Data
* Algorithm is built on the basis of clear requirements
Weak points:
* No research data available
* Evaluation is unclear
* Not sure why there is a strong focus on stream processing
3 Questions see earlier.
Typo: Ontoloty in reference 6
After rebuttal:
I do not wish to change the scores on my review. I believe the contribution is indeed limited, as mentioned by other reviewers, but unique and novel in its kind and may lead to interesting discussions.

Review 4 (by anonymous reviewer)

(RELEVANCE TO ESWC) Very relevant
(NOVELTY OF THE PROPOSED SOLUTION) Interesting concept which has been tackled before
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The presentation is clear.  The evaluation provides a level of validation.
(EVALUATION OF THE STATE-OF-THE-ART) Quality of Service approaches from stream processing would have relevance here, at the very least to show a gap in their efforts.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) An evaluation of the approach is provided with a discussion on the limitations of the approach.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The description of the experiment is OK.  No code or datasets released.
(OVERALL SCORE) This work tackles the problem of sharing access and processing fees for a set of RDF stream processing graphs with overlapping operators and sources among multiple consumers. The paper proposes a model that uses a joint execution plan, which is based on the collected queries of all data consumers. It combines this query execution plan with the outcome valuations, runtime limits and willingness to pay values from all data consumers, the pricing of the raw data streams, and the pricing of the computations to determine a utility-maximizing payment distribution.  The approach is evaluated to determine the run-time performance.
Strong points:
- Equal need cost sharing: Their price sharing algorithm follows an equal need cost sharing method for operators/sources that overlap between multiple queries, where the assigned price share for the whole query can never be higher than the price of running the query in isolation.
- Maximized Utility: They allow the user to provide a utility value for running the query for a specific period of time, a runtime limit, and a total willingness-to-pay value. The provided parameters are used by the price sharing algorithm to allocate runtime and charge payments as long as the assigned price is smaller than or equal to the consumer's value, thus maximizing the consumer’s utility.
- Illegitimate Utility Gain: Assigning price shares to queries is ignorant of any value, runtime limits, or budget for consumers, hence the consumers cannot benefit from manipulation by misinformation.
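The equal-need sharing property summarized in the strong points above can be sketched as follows. This is a hypothetical illustration only, not the authors' actual Algorithm 1; the function and variable names are invented.

```python
# Hypothetical sketch of equal-need cost sharing: the price of each
# operator/source is split equally among the queries that use it, and a
# query's total share can never exceed its stand-alone (isolated) price.
from collections import defaultdict

def equal_need_shares(query_ops, op_prices):
    """query_ops: query id -> set of operator/source ids it uses.
    op_prices: operator/source id -> price of running it."""
    users = defaultdict(list)
    for q, ops in query_ops.items():
        for op in ops:
            users[op].append(q)          # who shares each operator/source
    shares = {}
    for q, ops in query_ops.items():
        share = sum(op_prices[op] / len(users[op]) for op in ops)
        isolated = sum(op_prices[op] for op in ops)  # price if run alone
        # The equal split already guarantees share <= isolated; the min()
        # just makes the cap property explicit.
        shares[q] = min(share, isolated)
    return shares

# Two queries sharing one source "s": each pays half of the shared part.
print(equal_need_shares({"q1": {"s", "a"}, "q2": {"s", "b"}},
                        {"s": 10.0, "a": 2.0, "b": 4.0}))
# → {'q1': 7.0, 'q2': 9.0}
```

Because each shared operator's price is divided equally among its users, no query is ever charged more than it would pay in isolation, which is the cap the reviewer highlights.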
Weak points
- Multiple users per query: Their algorithm for price distribution calculates the price shares based on the different queries in the model, assuming a single user per query, which may not be the case in a real-world scenario. However, they mention including a special kind of license attached to the streaming result of the query which prohibits the redistribution of the results outside the legal entity (a person or a company). This defeats the concept of a contention ratio; thus, the number of users should be part of the price sharing model.
- User Negotiation: The price sharing algorithm charges payments as long as the assigned price is smaller or equal to the consumer's value which is beneficial for maximizing the consumer’s utility. However, if the calculated price share is greater than the consumer value, the algorithm drops the whole query and recalculates the price shares for the remaining queries. (1) Perhaps they should consider some sort of negotiation with the user before not providing the service at all (Future work), and (2) In a model with multiple users per query, the query cannot be dropped.
- Scalability: Their evaluation shows that the solution is limited in scalability when tens of thousands of queries are simultaneously involved which they acknowledge in their conclusion section.
- Related work:  QoS for Stream Processing should be covered.
Have you considered the effect of any common runtime stream processing optimizations e.g. operator replication, operator reordering, etc., operator failures or not meeting consumer QoS constraints on price sharing?
After rebuttal:
My score remains the same.  If accepted I would encourage the authors to acknowledge the limitations highlighted by the reviewers as future research opportunities.

Metareview by Intizar Ali

This paper presents a price sharing model for streaming data which considers cost and resource sharing among multiple data consumers. Pricing models are well addressed for cloud-based and services-domain solutions. However, the proposed study is one of the initial efforts to introduce a pricing model for the Web of data. A well-established pricing model will be key to creating a self-sustaining economy for the Web of data, and there is no doubt that this study can bring very interesting discussion at ESWC. However, there are strong reservations regarding the novelty of the approach, with limited contribution beyond the state of the art. Also, one of the reviewers rightly raised a few concerns related to the assumptions and the model itself, which need a careful revision.  The paper in its current state has too limited a contribution to be accepted as a full research paper, but a careful revision of the assumptions, pricing model, and evaluation can certainly have a good impact on the sustainability of the Web of data. We strongly encourage the authors to submit a revised version of their paper to upcoming editions of semantic web related conferences.

Paper 227 (Research track)

Dynamic Tailoring of RETE Networks in Incremental Scenarios

Author(s): William Van Woensel, Syed Sibte Raza Abidi

Full text: preprint

Abstract: Decision support systems, with production rule systems at their core, have an opportunity to leverage the embedded semantics of semantic, ontology-based data to improve decision support accuracy. Advances in mobile hardware are enabling these rule-based systems to be deployed on mobile, ubiquitous platforms. By deploying reasoning processes locally, time-sensitive tasks are no longer influenced by network conditions, less bandwidth is wasted, and fewer remote (costly) resources are needed. Despite hardware advances, however, recent benchmarks found that, when directly re-using existing (PC- or server-based) technologies, the scalability of reasoning on mobile platforms is greatly limited. To realize efficient semantic reasoning on resource-constrained platforms, utilizing rule-based axiomatizations of ontology semantics (e.g., OWL 2 RL), which are known to trade expressivity for scalability, is a useful first step. Furthermore, the highly dynamic nature of mobile and ubiquitous settings, where data is typically encountered on-the-fly, requires special consideration. We propose a tailored version of the RETE algorithm, the mainstay algorithm for production rule systems. This algorithm dynamically adapts RETE networks based on the evolving relevance of rules, with the goal of reducing their resource consumption. We perform an evaluation of semantic reasoning using our custom algorithm and an OWL2 RL ruleset, both on the PC and mobile platform.

Keywords: RETE; OWL2 RL; rule-based reasoning; OWL reasoning; reasoning optimization

Decision: reject

Review 1 (by anonymous reviewer)

(RELEVANCE TO ESWC) The article depicts optimizations of the Rete algorithm for forward inference in rule based systems, and the approach is empirically evaluated over the OWL 2 RL Benchmark Corpus.
Both seem relevant to ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) According to the authors themselves, the optimizations are similar to [Doorenbos 95] (and to some extent to [Miranker 87]), and a detailed comparison is missing to appreciate the novelty of the approach.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Completeness is claimed to be preserved, but no justification is given for the case of fact retraction (see the full review below).
(EVALUATION OF THE STATE-OF-THE-ART) Pointers to the literature are relevant, but the analysis of the state-of-the-art needs to be more detailed.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The optimizations are very clearly described and illustrated.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) Thanks to a precise description of the algorithm, and to the publication of the ruleset and data, the evaluation should be reproducible.
The main weakness of the evaluation is the chosen baselines (see the full review below).
(OVERALL SCORE) The article proposes a series of optimizations of the Rete algorithm, one of the most popular algorithms for forward inference in rule based systems.
The optimizations are meant to target resource-constraint platforms (such as mobile phones), and incremental reasoning, i.e. scenarios where additional facts (or fact retraction) may be dynamically fed to the inference engine. 
The optimizations prevent attempting some joins which will necessarily fail, because the memory of one of the join arguments is empty.
This is performed by "unlinking" alpha nodes from (the Rete graph of) certain rules, or further "pausing" them if they have been unlinked from all rules in which they appear.
A triggering mechanism allows relinking/reactivating alpha nodes once the join operations have a chance to succeed again.
As a side effect, if a paused alpha node never resumes execution, unnecessary matches will be avoided.
A detailed description of the whole unlinking/pausing/resuming mechanism is provided, including an algorithm, called Rete_tailor.   
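The unlinking/pausing/resuming mechanism described above can be sketched roughly as follows. This is a hypothetical illustration, not the paper's actual Rete_tailor implementation; the class and method names are invented.

```python
# Hypothetical sketch of the unlink/pause/resume idea: an alpha node is
# unlinked from a rule's join when the opposite join memory is empty,
# paused once it is unlinked from every rule that uses it, and relinked
# (replaying buffered tokens) once a join could succeed again.
class AlphaNode:
    def __init__(self, rules):
        self.rules = set(rules)        # rules whose joins use this node
        self.unlinked = set()          # rules it is currently unlinked from
        self.buffer = []               # tokens stored while paused

    @property
    def paused(self):
        return self.unlinked == self.rules

    def on_empty_join_memory(self, rule):
        self.unlinked.add(rule)        # the join would necessarily fail

    def on_join_possible(self, rule):
        self.unlinked.discard(rule)    # relink: opposite memory non-empty
        if self.buffer:                # resume: replay buffered tokens
            tokens, self.buffer = self.buffer, []
            return tokens
        return []

    def receive(self, token):
        if self.paused:
            self.buffer.append(token)  # skip match/join work for now
            return None
        return token                   # propagate into the network

a = AlphaNode(rules={"r1"})
a.on_empty_join_memory("r1")
assert a.paused
assert a.receive("t1") is None         # buffered, not propagated
assert a.on_join_possible("r1") == ["t1"]
```

As a side effect, if the node never resumes, the buffered tokens are never matched, which is where the saved work comes from; the buffer is also the memory cost the review mentions for paused alpha nodes.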
The strategy is then empirically evaluated over the OWL 2 RL Benchmark Corpus, together with a partially handcrafted ruleset, available online, in addition to helpful links and samples.
The two baselines are a naive implementation of the Rete algorithm, and a simple a priori tailoring strategy, which consists in running the naive Rete over a fragment of the dataset, discarding unused rules, and then performing (incomplete) inference over the remainder of the dataset. 
The article is particularly well written, clear, and largely self-contained.
The more technical aspects are introduced progressively, and illustrated with examples.
A precise description of the algorithm, together with the publication of the dataset, make the experiment reproducible.
Possible limitations (such as the need for storing tokens in memory for paused alpha nodes) are explicitly pointed out, as well as similarities to optimizations proposed in the literature.
My main reservations are:
1) An insufficient positioning/analysis of the contribution.
Rete-based forward inference is a relatively mature technology.
A number of improvements of the naive algorithm have already been proposed/evaluated, and cost-models have even been designed (see for instance Wright et al. 2013 below).
In this context, a finer-grained (theoretical and/or empirical) comparison of Rete_tailor with existing optimizations is expected.
In particular, the parallels drawn in Section 5 are too high-level to appreciate the novelty of the approach.
It is argued to be well-suited for resource-bounded scenarios and incremental reasoning, but the article does not explain why previous optimizations of the Rete algorithm may not.
2) The baselines chosen for the empirical evaluation are arguably not appropriate.
Comparing Rete_tailor to the naive Rete algorithm is not very informative, as the latter has long been replaced in practice by optimized versions.
As for the other baseline, it does not guarantee completeness.
So the performance comparison (number of join/match actions) is not only unfair to Rete_tailor, but also somewhat irrelevant, as both systems may not yield the same output.
Instead, one would expect a comparison with some of the optimizations proposed for instance in [Doorenbos 95], with which Rete_tailor has a lot in common, according to the authors.
Other concerns are the following:
3) Rete-based fact retraction is mentioned several times, but never (even briefly) described.
Retracting a fact is arguably a key aspect of incremental reasoning, which is the application scenario targeted by the authors.
To my (limited) knowledge, Rete-based retraction consists in applying the inference mechanism in a reverted fashion.
But it is unclear how the proposed optimizations behave in scenarios where facts are both added and retracted.
In particular, can completeness be affected by the interaction between asserting/retracting a fact and pausing/resuming alpha nodes?
The answer may be obvious, but a short justification would help.
4) Although OWL 2 RL can express disjointness, the assumption seems to be made that the input dataset (data + ruleset) is consistent.
Maybe this could be explicitly said. 
### Questions ###
- Page 5, Section 3.2, definition of h.1: "and \alpha memories in case i <= 2; Fig. 1".
I could not find an "i" in Figure 1.
Is it the index of the alpha node?
- Section 5: How does Rete_tailor compare to [Özacar et al. 2007]?
### Suggestions ###
- Page 7, Section 4.1.4 7, T.iii: "to avoid redundant join and match operation": "redundant" is possibly misleading.
Intuitively, redundancy means matching twice the same token (at different times), or joining twice over the same pair of tokens.
I suggest something more explicit, like "avoid necessarily failing join/match attempts".
- Page 7, Section 3.4: "Code 1" -> "Algorithm 1"
- Page 8, second paragraph: "previously deemed redundant". Same remark as above for "redundant".
- Section 5 helps understanding the intended application context, as well as the specificity of the approach, in particular to what extent the proposed work differs from [Doorenbos 95].
Therefore it may appear earlier in the article.
- Bibliography: at least one of the articles of Forgy (author or the original Rete algorithm) should be mentioned (e.g the one referenced below).
### Typos ###
- Figure 1: t3 is used instead of t2
- Page 4, 2nd paragraph: "Afterwards, the tailored". Part of the sentence is missing.
- Page 7, 3rd paragraph: "is longer empty" -> "is no longer empty"
- Page 7, Section 4.1.4 7: "took place the original dataset". Missing preposition.
- Page 7, case "t_x > t_y": "to beta node \beta_1(network1 > network2)".
Seems like what is meant here is "(network1 > network3)", as network2 is never reached in the case t_x > t_y.
- Page 12: avoid splitting Table 2 over 2 pages.
### References ###
Forgy, Charles L. "Rete: A fast algorithm for the many pattern/many object pattern match problem." Readings in Artificial Intelligence and Databases. 1988. 547-559.
Wright, Ian, and James AR Marshall. "The execution kernel of RC++: RETE*, a faster RETE with TREAT as a special case." Int. J. Intell. Games & Simulation 2.1 (2003): 36-48.
After rebuttal
Thanks to the authors for their response letter.
Please find below my comments about it.
1) "one requires a comparison to the baseline system": fair enough. A comparison to the standard Rete seems indeed relevant. My point was that it is not sufficient.
2) "An explicit comparison with the optimizations from Doorenbos [18] seems unsuitable, since most of these are included in our work": I have to disagree with this statement. I was precisely expecting a comparison between optimizations from Doorenbos [18] (at least the ones implemented by the authors) and the Rete_tailor algorithm. Currently, the experiments do not specifically evaluate the new optimizations proposed in the article (namely pausing, and the lower-failed alpha node heuristic).
3) I also still disagree with the relevance of the other baseline (a priori tailoring).
The evaluation compares performances (token matches, join attempts, reasoning time) of the different algorithms.
Therefore they should produce the same output.
To take an analogy, imagine comparing query evaluation times of two SQL engines.
This is arguably pointless if one of them returns incorrect or incomplete answers.
Put another way, one can only expect a qualitative comparison of Rete_tailor and a priori tailoring (number of inferred triples, ...).
4) "other approaches with the same goal" and "the most relevant work in the field":
I may be missing something obvious here (apologies if I am).
Which "field" exactly are we talking about?
If I understand correctly, the field is "Rete-based forward chaining in incremental scenarios", as opposed to "Rete-based forward chaining in non-incremental scenarios".
And according to the authors, the most relevant work in the former field is Tai [5].
But what exactly makes the incremental case different?
Section 3.3 paragraph 2 is not sufficient for me to understand it.
More exactly:
- Let us first assume that no fact is retracted. Which optimizations are better-suited to the incremental case, and why (is this due to additional parallelization opportunities in the non-incremental case)?
- Now let us assume that facts can be retracted. Are there additional differences between incremental and non-incremental?
And if so, why is the literature about deletion (in the non-incremental case) not relevant?

Review 2 (by Wouter Beek)

(RELEVANCE TO ESWC) While reasoning is an important use case on the Semantic Web, I am not sure whether the adaptation of reasoners for resource-constrained environments is specifically relevant.  The desktop and mobile systems that are used in the evaluation are already almost comparable in hardware specs.  There may only be a very brief window (two to five years) in which mobile and desktop devices have access to different computational resources.  It certainly would have helped to have at least some discussion of the mid- to long-term utility of this work in the paper.  I can imagine some additional long-term applications in an IoT context, but the paper does not mention these at all.
(NOVELTY OF THE PROPOSED SOLUTION) The proposed approach does include non-trivial extensions of an existing approach and of existing algorithms.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The approach explicitly preserves completeness of entailment results (as opposed to a specific existing approach), and completeness is also evaluated with respect to multiple ontologies.
(EVALUATION OF THE STATE-OF-THE-ART) Does Table 1 show the total number of unlinking/relinking operations over all ontologies?  Is this not a very small number / does this even matter when compared to the total number of operations?  In the text the general applicability is explained to be larger than these numbers suggest: “it is likely that [real-world scenarios] will require many more reverting operations.”  But is this really the case?  Since real-world data seems readily available on the Semantic Web, it should be possible to determine the validity of this supposition.
I am not 100% sure whether I understand the outcomes reported in Table 2.  The main outcome seems to be that the here presented approach has performance comparable to the existing approach of Tai et al., while preserving completeness.  However, the improvement in performance of both approaches compared to the default case seems to be small/negligible.  It is a bit unfortunate that Table 2, which is already somewhat difficult to read/interpret, is split across two pages.
Since the resume heuristic requires an external store, I was expecting the size of this store to be quantified as part of the evaluation.  After all, is the performance benefit in the mobile scenario not (partially) achieved by offloading the mobile process' storage onto another process / the triple store?  If the latter is the case (that is at least how I understood the setup), then it would be useful to also know the memory and CPU consumption of the triple store.
Since the difference between mobile and desktop environments is crucial to the main research question of this paper, it is unfortunate that "average performance times for PC and mobile are not directly comparable".
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The comparison to other approaches and implementation in the Related Work section is very good.  However, I am missing a more high-level discussion/reflecting on the utility of the here presented research.  I am prepared to accept use cases for mobile reasoners, but to what extent do they really have restricted access to computing resources (when compared to desktop systems)?
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) I did not find a link to the published Open Source code of the here presented implementation, but it is probably not difficult for the authors to add such a link.  Since the data and rulesets are shared online, through a link that is given as a literature reference (which is a bit unusual), availability of the code would enable reproducing the here presented evaluation results.
*After the rebuttal*: Since no link to an Open Source version of the code was shared, the reproducibility of the here presented work is more difficult than originally assessed.
(OVERALL SCORE) This paper extends the existing RETE algorithm with a heuristic-driven approach that results in less memory and/or CPU consumption, while preserving the completeness of reasoning over an important OWL fragment.  The reduction in hardware resources specifically applies to resource-constrained environments like mobile phones, where a process has limited access to RAM.
Strong points:
- The invention of two intelligent extensions that could positively impact performance of the RETE algorithm without sacrificing its completeness in monotonically increasing databases.
- A good Related Work section, where alternative approaches are explained.
- The paper is written very well and is nice to read.  I could find almost no mistakes (not even minor ones).
Weak points:
- The utility of the paper currently relies on a property of the hardware market (desktop versus mobile) that may no longer exist within a couple of years.  (The utility is at least not substantiated, e.g., by referring to discussions in the hardware and/or mobile research communities.)
- The evaluation does not deliver the numbers that seem to be most interesting: a direct comparison between desktop and mobile.
- The evaluation results -- assuming I am interpreting them correctly -- seem to indicate that the proposed approach only marginally affects performance.  Yet the text does not draw this conclusion, but sometimes refers to different (real-world) datasets for which the proposed approach would have / could have had different results.
Small grammar errors:
- Ungrammatical sentence: “Afterwards, the tailored ruleset for future
- p6. “This [is] realized”
- p10. ungrammatical phrase: “A reasoning cycle took place the initial
- p11. “seem[s] negligible”
*After rebuttal*
Thanks to the reviewers for their rebuttal.  Unfortunately, my worries
about the main purpose of this paper have not been cleared.  The
authors say that “Although desktop and mobile systems are comparable
in hardware specs, there are huge differences in performance.”  The
paper gives detailed information about the former (in Section 4.1.3),
but never quantifies the -- apparently more important -- differences
in performance.  Based on the details in Section 4.1.3, it is not
clear where the huge differences in performance originate from.  They
must be due to properties not documented in the paper (e.g., more
aggressive power conservation strategies on mobile).  If these
properties are the most important differentiator and motivator for
this research, then they should be documented (in addition to or
instead of the less important hardware specs).
Also: “Regardless of improvements in technology, the processing and
memory capability of the latter will be smaller than those of the
former.”  But this is already not the case according to the
information the paper gives us: a dual-core 2.9Ghz processor (desktop)
does not have a larger processing capability than a quad-core 2.2Ghz
processor (mobile), unless properties not documented in this paper are
in place (see above).

Review 3 (by Brian Davis)

(RELEVANCE TO ESWC) The paper is extremely relevant to the Reasoning track under:
optimisation, approximation, or evaluations of web reasoning algorithms,  i.e., of procedures that take, as an input,  ontologies (usually in RDF, RDFS, OWL, RIF) and test entailments or answer queries.
(NOVELTY OF THE PROPOSED SOLUTION) The paper presents an advancement of the RETE pattern matching algorithm for implementing production rule systems, which dynamically adapts RETE networks at run time and aims to reduce consumption of computational resources.  The dynamic tailoring involves unlinking alpha memories from the network as well as pausing alpha nodes such that they will no longer be matched by incoming tokens.  The RETE tailor algorithm seeks to guarantee completeness in incremental scenarios, linking and resuming at runtime when required.  The delta appears to be that they extend null left activation with a second heuristic - lower failed alpha nodes - as well as the addition of a second operation - pausing - which avoids redundant match operations.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The algorithm is very well described and is clearly differentiated against the state of the art.  The experiment setup is well described, including ontologies, rulesets used and benchmark configurations and metrics.
(EVALUATION OF THE STATE-OF-THE-ART) The authors describe the state of the art quite thoroughly and assert their  contribution relative to the related work.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The RETE Tailor algorithm is very well described and is clearly differentiated against the state of the art.  The experiment setup is well described, including ontologies, rulesets used and benchmark configurations and metrics.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The experiments appear to be well documented, referencing the algorithms used in the benchmark configuration and the datasets used, in addition to a link to online documentation.
(OVERALL SCORE) The paper presents an advancement of the RETE pattern matching algorithm for implementing production rule systems, which dynamically adapts RETE networks at run time and aims to reduce consumption of computational resources.  The dynamic tailoring involves unlinking alpha memories from the network as well as pausing alpha nodes such that they will no longer be matched by incoming tokens.  The RETE tailor algorithm seeks to guarantee completeness in incremental scenarios, linking and resuming at runtime when required.  The delta appears to be that they extend null left activation with a second heuristic - lower failed alpha nodes - as well as the addition of a second operation - pausing - which avoids redundant match operations.
Strong Points
Mostly very well written
Algorithm and experiments well described.
Good knowledge of related work
Weak Points
Section 3 is somewhat heavy to read; it could have been structured and broken down more clearly and succinctly, as opposed to what appears to be a lengthy discussion in parts.
A minor issue is the frequent use of "Further" in Section 4.2.  I don't think that is an adverb; rather, it is an adjective.

Metareview by Jeff Z. Pan

The paper addresses the topic of resource aware incremental reasoning by dynamically adapting the RETE networks at runtime to reduce resource consumption. The reviewers agree that this line of research is interesting and is related to the Semantic Web. However, they also identified important weaknesses which led to a reduced value of the current version of this work:
(W1) An insufficient positioning/analysis of the contribution against related works, including the Delete and Rederive approaches, and incremental reasoning approaches for OWL RL and OWL EL;
(W2) Section 3 is somewhat heavy to read;
(W3) The utility of the paper currently relies on a property of the hardware market (desktop versus mobile) that may no longer exist within a couple of years: more thorough motivations (supported by evidence) should be provided.
The paper was suggested as a candidate cond-accept paper. Unfortunately, it didn't manage to stay above the threshold in the final round of decision making. We hope the comments here are useful for a future submission of the work.

Paper 226 (Resources track)

Build a corpus of scientific articles with semantic representation

Author(s): Jean-Claude Moissinac

Full text: preprint

Abstract: As part of the SemBib project, we undertook a semantic representation of the scientific production of Telecom Paristech. Beyond the internal objectives, this enriched corpus is a source of experimentation and a teaching resource. This work is based on the use of text mining methods to build graphs of knowledge, and then on the production of analyzes from these graphs. The main proposal is the disjoint graph production methodology, with clearly identified roles, to allow for differentiated uses, and in particular the comparison between graph production and exploitation methods. This article is above all a methodological proposition for the organization of semantic representation of publications, relying on methods of text mining. The proposed method facilitates progressive enrichment approaches to representations with evaluation possibilities at each step.

Keywords: semantic publishing; publication; Linked Data; SPARQL

Decision: reject

Review 1 (by Giuseppe Rizzo)

This paper addresses the need to turn the ParisTech library into a richer archive of data and relations among authors and papers, and into a better indexed and thus more searchable archive. This pain point applies to any library of universities and organizations; thus the objective is certainly of help for the community and society as a whole. 
The approach presented aims to apply basic text mining notions (identified in the paper as TF-IDF -- only) to the generation of a knowledge base of scientific concepts and publications (scientific papers), where concepts and publications have explicit relations. Thus, the approach presented in this paper is of interest for the Resources track as defined in the requirements of the call for papers. 
However, the paper lacks the necessary formalism and depth in describing core concepts (see below) and in presenting the knowledge base. 
The paper would also benefit from proof-reading, as numerous paragraphs are hard to understand (I omit the long list of typos but recommend checking plurals such as "different strategy", and the numerous sentences ending with three dots, which left me doubting the notion acquired by the writers and the notion delivered to the readers).
- idea and impact to scientists and society
- availability of the resource (with a big question mark see below)
- the portal is in French; this introduces an access barrier to the reuse of the data instances indexed by the portal. Furthermore, from the portal I cannot find instructions on how to access the instances. There is a "technical section" in , but it remains unclear how to traverse the graph. Despite the reported technicalities, the easiest means of allowing access to and re-use of the data instances is lacking
- the approach is half-way between a research paper and a resource paper, because it describes neither the text mining approaches used for the creation of the KB (beyond a few mentions here and there, such as TF-IDF) nor the KB's data model and instances with the necessary depth
- the value chain of the approach is based on a traditional process of data acquisition->corpus creation->data publishing. However, Sec 2.5 "Gather docs and perform basic treatments" (I reckon you meant basic processing), which should describe the first 2 stages of the value chain, does not answer the question "how has this been done". Then, the authors mention that the ontology used for modeling data points is based on a set of SPAR ontologies. Which ones? In principle I can indeed agree with the choice; however, the paper should give the rationale, and possibly a comparison, for why those ontologies in such a context (a citation could already be enough)
- the authors mention that the approach is grounded on the decoupling of concepts vs publications, and this part has been underlined as important for the creation of a "ground truth". Which ground truth, by the way? The choice of decoupling, assuming it is crucial, needs to be described comprehensively if you want to give it such importance
- the other stage of the value chain is the publishing, which should ease exploration. This part cannot be assessed (see above). The author introduces the technology choice made for deploying a SPARQL endpoint, but does not elaborate on it, failing to give the necessary inputs for scientists to understand the motivations. Imagine: if I want to replicate or extend such an approach, how could I know which technology is more efficient, better performing, ... (and we can list here many more evaluation dimensions)? 
- a wrap-up section on the achievements based on SemBib is presented. This section should be expanded in order to show the actual benefit of the approach, ideally using a proper scientific validation (hypothesis formulation, experimental set-up definition, KPI derivation and quantification)
======= AFTER REBUTTAL ========
Thanks for answering. However, my points concerning the how (3rd, 4th, 5th) haven't been answered. I have to confirm my previous mark as I don't have the necessary information to judge the relevancy of this approach and thus the resource that is generated.

Review 2 (by Cristina Sarasua)

This paper presents a data set containing RDF metadata about scientific articles by Telecom ParisTech. The author reused state-of-the-art ontologies from the bibliographic domain (e.g. Fabio and dcterms), used state-of-the-art tools to mine the text of publications to obtain their keywords, and enabled a SPARQL endpoint. The author claims in the abstract that the contribution of the paper is the organization of publication metadata in different graphs. 
My main concern with this submission is that it provides a very limited contribution and it ignores the related work done in the field of scientific RDF data publishing. Even if the submission is a Resources submission, a data set submission in this track should somehow contribute to the state of the art. One way to do that is to provide data that can be used for benchmarking methods (and hence, includes an exhaustive set of test cases manually reviewed by experts). From the description provided in the text and the content visible from the SPARQL endpoint, it does not look like this data set provides a solid benchmarking data set, and it is not a data set that is supposed to help validating new methods either. While having a new data set about publications is in general appreciated, the submission does not provide a novel research contribution, and it looks like an engineering effort. Moreover, the submission provides a data set that ignores some Semantic Web publishing practices, and it lacks proper documentation (see more details below).
I recommend the author to work in the direction of new methods to improve the annotation and the interlinking of scientific publications and submit a novel contribution to a Semantic Web venue in the future. 
**Positive aspects**
- It adds RDF data into the Web of Data.
- The author reused some existing ontologies, such as Fabio and dcterms. 
- The data set contains a concept graph with the papers’ keywords.
** Negative aspects**
- The contribution of this paper does not have novelty in terms of research. Many data sets with scientific articles have been published in the past (see the publications colour in the LOD diagram), and the submission does not refer to any of the existing data sets. 
- The submission lacks specificity. For example, when the author describes the representation of documents, the author says that “a semantic representation of the metadata has been realised by relying on SPAR ontologies (bibo, cito) and ontologies commonly used for documents (Dublin Core, schema, foaf …);”, without providing a comprehensive description of the exact parts of the ontologies used, nor providing examples of representative resources. A graphical representation or a Turtle/N3 description of the RDF resources that the data set contains would help the reader have a clear understanding of the exact data published. Moreover, the author mentioned that data is exposed via a SPARQL endpoint, but the text does not provide the URL of the endpoint. Guessing that it would end in “*/sparql” I found it on the Web. However, the author should be aware of the fact that a Resources submission should indicate such details (see the call for papers at ESWC 2018 and last year’s analogous call and program in ISWC ).
- The work presented seems to be incomplete, as the text mentions that they started to evaluate methods for learning word vectors, and they are working on enrichment.
- The submission does not fulfil the requirements of a high quality submission in the Resources track. 
** As mentioned before, the data set does not break new ground, it does not advance the state of the art, and it does not look like it will have impact in improving the adoption of Semantic Web technologies — the publications domain is quite well researched, and (despite the challenge of convincing specific organisations to do so), more and more librarians and archivists are involved in standard Semantic Web metadata publishing. 
** The Web site of the project is exclusively in French. Considering that it will have an international scientific audience, English would be needed.
** The author says in the conclusions and outlook section that “An important next step is the RDF documentation of this dates with DCAT, and then the publication of this data”. I assume the data is published (since it is queryable via the SPARQL endpoint), and again, the DCAT description is something the author should have provided as for the submission (see also ). The author should make the data FAIR (see also
** It is very difficult to assess whether the resource will be useful for a wider audience, since the author does not indicate the extent to which the data set intersects with other publication data sets.
** The technical quality of the resource should be revised, applying best practices in Semantic Web publishing. See suggestions below to (i) interlink the instances to other data sets and (ii) revise the classification information for e.g. ResearchPapers
** The author does not provide enough descriptive statistics about the data set in the submission. For example, what is the AVG and STDEV of the number of keywords per publication in the data set? The RDF description of the paper I looked up, for example, does not have any keyword in the response obtained from the SPARQL endpoint, while the publication itself does contain keywords. It is unclear if that is due to multiple named graphs or because the data is not there. It would also be advisable to showcase queries that use the various graphs mentioned in the text.
** There is no information about the license of the data — or at least it is not mentioned in the submission, nor on the main Web site ( or
** The author does not provide any entry of the data set in Zenodo, GitHub, DataHub, Figshare etc.
**There is no sustainability plan specified. The author barely indicates technical tasks that they are currently working on (section 3.2) but does not mention the way the maintenance of the data set will be carried out. 
- The semantic forms developed, to display the resource descriptions in a more human-friendly way (e.g. are not a novel contribution either. There have been plenty of projects developing user interfaces for RDF data (e.g. all the work around semantic wikis and semantic portals). 
- The conclusions indicate that the paper describes "a general methodological approach for testing different approaches to the description of bibliographic entities by association with concepts” while the content of the paper mainly refers to the engineering process of preparing metadata about bibliographic entries, and the text lacks indeed details about processes.
- The quality of the writing should be improved: 
(i) The author should revise their text with an English native speaker. 
(ii) In some parts, the text does not look rigorous enough for a scientific publication (e.g. when the author lists the achievements based on SemBib, when the author mentions “the idea is to use external graphs”, in sentences like “we can try different strategy to associate..” and “Important progress remains to be made to improve the exploitation of semantic graphs. their adoption, especially on the Web, is still very limited compared to the potential of these representations, especially driven by thematic operations- music, events ... - carried by major search engines (cf”) 
(iii). The text contains explanations about Semantic Web standards or principles that are unnecessary for a Semantic Web audience and consume valuable space. 
** I recommend the author to translate the explanations offered in their Web site ( from French to English, to provide the information in both languages. 
** I recommend the author to link the resources in this data set to other external data sets. For example, the author could have also already linked the instances via owl:sameAs links to resources in other existing bibliographic data sets such as DBLP. If this (SemBib) data set contains keywords that other data sets do not have, then this new data set would be enriching the description of the article.
** I went to the SPARQL endpoint
** I executed the following query to look up the description of a ResearchPaper: describe
(The query returned the description of the paper "Theory of masking with codewords in hardware: low-weight $d$th-order correlation-immune Boolean functions".)
** And I saw that the description of the paper does not link to any other external representation of the paper
** While DBLP ( contains the representation of that paper ( as well as the representation of the main author (
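As an illustration, an interlinking check of the kind discussed here could be run against the endpoint with a query along the following lines (a sketch only: the subject URI is a placeholder, since the actual SemBib identifiers were not preserved in this review):

```sparql
# Hypothetical sketch: check whether a SemBib ResearchPaper resource
# is linked to any external representation (e.g. in DBLP).
# The subject URI below is a placeholder, not an actual SemBib identifier.
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT ?external WHERE {
  <http://example.org/sembib/paper/12345> owl:sameAs ?external .
}
```

An empty result for such a query would confirm the observation that the resource descriptions carry no links to other data sets.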
- I encourage the author to identify the intersection and the differences that this data set has with other bibliographic data sets, in terms of concepts, resources (people, publications, topics), and facts (i.e. statements). I would also recommend to try to enrich the data set with other information, not necessarily from the bibliographic domain.  
**Other things**
- Why is the property “entrytype” used as a datatype property in the description of a ResearchPaper, if this way it helps little to classify the resource? Moreover, the description already contains an rdf:type statement. What value does “entrytype” add to it?
In the resource description shown above, there are two statements as follows:

- Section 2.3 mentions “Our hypothesis is that advances in the semantic web are able to give us new ways to efficiently exploit the data we collect, in order to provide research and analysis functions. “ That very much depends on the specific definition of the objective, which is not clearly specified. The hypothesis is not tested in any way in the paper. 
- In page 3 the submission says “Large warehouses of bibliographic data exist elsewhere. Unfortunately, they give a very truncated view of our production, in particular because they are not able to resolve the changes in the name of our institution and their usual variants. Moreover these bases do not have information on internal structures of research: projects, departments and groups of research …” . Has the author considered that usually information about projects, institutions etc. are present in separate data sets? That is why it is so important to link the data to other data sets (for the sake of information coverage, and data maintenance).
- The author says that publications were crawled but the text does not discuss matters such as the copyright of publications. How was this issue handled?
- "Publication venue" sounds better than the term used ("publication channel"), when one refers to conferences and journals.
- When one tries to execute the query on pages 8-9, if one includes the PREFIX statements the endpoint gives an ARC2 error, and without the PREFIX statements, as the author indicated, the query gives a "this site cannot be reached" error. See also
- The plot shown in Figure 1 does not provide a very useful insight to the reader. The author should think of communicating the information differently. Perhaps the author could show a table, including information about the shared keywords, clustering the information by conference or topic, and give statistics based on the number of keywords shared.
** After rebuttal **
Thank you for replying to our questions and comments. 
I would like to add that the comment about "ignoring other publication data sets" didn't refer to the fact of citing them in the text, but rather to the process of integrating the data. 
Finding different ways to reuse existing technology is a valid approach to explore a field and come up with new solutions to new problems. However, I still think that this work requires more novel components and further elaboration to be accepted.

Review 3 (by Jodi Schneider)

Thanks for your many comments and especially for trying to make your materials more accessible to non-French speakers!
This paper describes the process of producing a semantic representation of publications in the Telecom Paristech full-text repository. While semantic metadata for publications has been extensively curated, work on the full text of papers is novel. Even as an expert in this area I find the paper useful in going further than the published literature on the details of corpus enrichment.
The fit with the resource track is not completely clear; I would think of it as an industry/in-use application. It would be even better to reframe the text as a tutorial with select data released (e.g. presumably the non-copyrighted data).
Some suggestions:
- avoid dates in the form 1/12/2017 in international publications (whether this is Jan 12 or Dec 1 will depend on the location of the reader); YYYY-MM-DD (ISO 8601) is always safe. (It appears from the closing note "In not differently specified, all links were last followed on January 12, 2018." -- which should be "If not..." -- that this is actually Jan 12 2018.)
- remove space before footnotes so that they follow the letters (not follow the space)
- carefully proofread (e.g. "citet Larsen: 2010: Scientometrics: 20700371"), ideally have a native English speaker proofread (e.g. in English there is no space between a number and the % mark).
- check references for consistency and usefulness. For instance, I don't know what ToTh is. Technical reports should ideally have URLs given to make them easier to find.
- Figure 1 needs improvement. It's not clear what the colors mean. Do you have any comment to make on the disconnectedness of keywords?

Review 4 (by anonymous reviewer)

The paper presents a collection of scientific articles with semantic representation. It follows a graph-based approach for capturing the metadata. The overall goal would be to enable connection with further distributed repositories of scientific publications.
At the current state of the repository, I am not sure that it is ready to be released to the community. As the authors point out, there are already quite a few large and established repositories for academic articles. A new one would have to have very definitive and clear advantages. The main concerns are: 1) why not use a standardised format for the metadata? This would also include linking to existing graphs of authors, for example (or reusing existing ontologies). 2) The first paragraph of 2.5 does not really clearly state what the state of the repository is. Is it an initial pilot with only a few articles, are there any further repositories like it, is it in a final and stable version? 
Overall, the work is interesting but the potential impact for the community is not absolutely clear.

Paper 225 (Research track)

Navigation in Large Ontologies

Author(s): Aamna Qamar

Full text: preprint

Abstract: The ever-growing data on the Web has given rise to ontologies reaching the size of 100,000s of concept names. The tools available to manipulate and navigate in ontologies are not competent enough to handle such large ontologies. This paper discusses a new tool prototype that allows the ontology engineers to easily perform navigation and exploration tasks in large ontologies for their sense-making. The navigation tasks, like ontology summary, focusing and zooming can be achieved through search functionality that can be run with or without reasoner. For filtering and extracting modules, the syntactic-locality modularization tools are also incorporated in the prototype. The evaluations presented in this paper demonstrate the significantly positive results obtained by experimentation with real-world large ontologies on this tool.

Keywords: Ontologies; key concepts extraction; ontology engineering; Web Ontology Language (OWL); modularization; ontology navigation

Decision: reject

Review 1 (by anonymous reviewer)

(OVERALL SCORE) This rather short paper reports on an approach to enable navigation in large ontologies.
It starts out with some general remarks on ontologies, ontology languages and tools (which I would assume to be known to the community), then discusses Key Concept Extraction as a useful way of providing the user with a coarse overview of a big ontology. It then describes a tool developed by the author by recalling the creation history (maybe not so important for the reader) and describing the architecture. Thereafter the tool is compared against other ontology management tools in terms of loading time. This is followed by the description of some case studies that were conducted.
This contribution would make a nice software demo at ESWC or maybe a workshop contribution. It doesn't have sufficient substance for a conference paper as it remains unclear what new scientific insights it provides. The description of the functionality of the tool is somewhat vague and in order to clearly demonstrate its added value, some sort of user study would be necessary.

Review 2 (by anonymous reviewer)

(RELEVANCE TO ESWC) The paper deals with the problem of sense-making in large ontologies, and is, thus, highly relevant to ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) I see very little novelty in the paper. Or at least, this novelty is not obvious from the text. There are tons of visualisation and summarisation tools, and the only important novelty of the proposed approach seems to be the gradual loading of an ontology (which allows it to handle large ontologies).
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) A more elaborate description of the tool and its functionalities should exist.
(EVALUATION OF THE STATE-OF-THE-ART) The study of existing relevant approaches (related work) is incomplete. There has been a lot of work on graph visualisation, RDF graph visualisation, summarisation etc, but only some specific approaches are (partly) mentioned in the paper in Section 3.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The paper's evaluation is far from complete. In particular, the paper is missing a concrete evaluation of the user satisfaction from the tool. Some vague non-quantified statements exist, but this is not enough, especially for a tool that tries to improve user experience.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The paper's evaluation is far from complete. In particular, the paper is missing a concrete evaluation of the user satisfaction from the tool. Some vague non-quantified statements exist, but this is not enough, especially for a tool that tries to improve user experience.
(OVERALL SCORE) The paper describes a tool for visual sense-making and navigation in ontologies. The paper is missing crucial information, lacks an appropriate evaluation and is describing the tool's features very briefly.
Table 1: not all of the approaches mentioned there are obvious to the reader. Some discussion should be devoted to explaining the ideas behind these works.
The study of existing relevant approaches (related work) is incomplete. There has been a lot of work on graph visualisation, RDF graph visualisation, summarisation etc, but only some specific approaches are (partly) mentioned in the paper in Section 3.
The abstract mentions that the search functionality of the tool can run "with or without reasoner". In the paper, I could not identify an elaboration of this statement.
I would appreciate some screenshots of the proposed tool in action.
The paper's evaluation is far from complete. In particular, the paper is missing a concrete evaluation of the user satisfaction from the tool. Some vague non-quantified statements exist, but this is not enough, especially for a tool that tries to improve user experience.
- "also incorporated provide filtering"
Strong points
- Can work with only a partial loading of the ontology, which allows the tool to load big ontologies.
Weak points
- Unclear approach
- Insufficient evaluation
- Incomplete related work
Questions to the authors

Review 3 (by Christian Mader)

(RELEVANCE TO ESWC) The paper claims to propose an "efficient navigation tool for [...] large ontologies". However, the contributions are not clear at all. No research questions have been framed and no particular navigation and visualization tasks and challenges that are addressed by the proposed prototypical tool are provided. In the paper, not even a screenshot or a link to the tool that the author claims to have developed is given, so it is impossible to assess the work in a realistic way.
(NOVELTY OF THE PROPOSED SOLUTION) The author describes a tool that uses the existing strategy of "key concept extraction" (KCE) to improve ontology loading speed and ease navigation. It builds on existing work, as the author also states. No particular improvements to the KCE algorithm are reported. Furthermore, I cannot find any novel contribution with respect to the mentioned navigation and visualization methodologies.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The evaluation only focuses on the loading time of different ontologies by various tools. The author shows that the proposed tool can load 7 of the 8 evaluation ontologies, whereas the other 4 tools all fail to load 4 of these ontologies. For the remaining 3 ontologies, the tools against which the author compares the approach give a more mixed picture: one tool fails on 2 more ontologies, and the others are faster on some ontologies and slower on others than the proposed solution. Some reported loading times are even 0 (zero) seconds, which I find very questionable. I would also have expected an evaluation of how the navigation tasks actually implemented in the author's tool compare against the existing tools from a usability point of view (not only load time), but this is not provided.
(EVALUATION OF THE STATE-OF-THE-ART) There is no dedicated Related Work or State of the Art section. The author provides a general coverage of ontologies, OWL, and related tools (reasoners, OWL API) and briefly covers navigation-related tools and methods. The paper fails to provide an overview of recent developments in the field of ontology navigation methods and mainly focuses on KCE, without clearly presenting alternative methods or the actual advancements this work contributes.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) Based on the very poor evaluation and non-existent other documentation of the approach (screenshots, source code, live deployment) the properties of the approach cannot be assessed.
(OVERALL SCORE) The paper describes a prototypical tool that claims to support efficient navigation in large ontologies. It does so by loading only a subgraph of the ontology (key concepts) into memory and expanding/collapsing that subgraph depending on the user's zoom and focus. The author shows that the tool can load 7 well-known large ontologies (e.g., SNOMED, NCI Thesaurus) while 4 other compared tools (3 Protégé-based and KC-Viz) run out of memory with four of these ontologies.
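For concreteness, the mechanism the paper describes (keep only key concepts in memory; load and drop subgraphs as the user expands and collapses nodes) could look roughly like the following minimal sketch. All names here (`LazyOntologyView`, `fetch_children`, etc.) are illustrative assumptions, not the author's actual API.

```python
class LazyOntologyView:
    """Sketch of partial ontology loading: only key concepts are resident
    initially; children are fetched on expansion and dropped on collapse."""

    def __init__(self, key_concepts, fetch_children):
        # key_concepts: node ids loaded up front (the KCE result).
        # fetch_children: callback loading a node's direct subclasses on demand.
        self.loaded = {c: [] for c in key_concepts}
        self.fetch_children = fetch_children

    def expand(self, node):
        # Load the node's children into memory on first expansion.
        if node not in self.loaded:
            raise KeyError(f"{node} is not loaded")
        if not self.loaded[node]:
            self.loaded[node] = self.fetch_children(node)
            for child in self.loaded[node]:
                self.loaded.setdefault(child, [])
        return self.loaded[node]

    def collapse(self, node):
        # Recursively drop the node's descendants to bound memory use.
        for child in self.loaded.get(node, []):
            self.collapse(child)
            self.loaded.pop(child, None)
        if node in self.loaded:
            self.loaded[node] = []
```

A sketch like this also makes the memory claim plausible: resident size is proportional to the expanded subgraph, not to the full ontology.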
Strong Points:
1) Tool adopts a useful existing method (KCE) to reduce memory consumption on initial ontology loading
2) Few typos
3) Author got some writing practice
Weak Points: 
1) The contributions are not clearly stated, no research questions are provided. 
2) The proposed tool's actual feature set remains unclear and advancement of the state of the art is not lined out.
3) Paper structure unclear, existing work and approach are somewhat mixed up.
4) No way to reproduce or review the work (no screenshots, link, code or deployment provided)
5) Tool development methodology (sprint scopes) not of interest
6) Poor evaluation: why were the mentioned ontologies chosen? Are the zero seconds in the table really true? 3 of the 4 tools used for comparison are Protégé-based. The bar chart provides no additional information.

Metareview by Christoph Lange

The reviewers agree that the novelty of this work is not obvious and that the research contribution is not explicit. There is no serious evaluation of concrete user satisfaction, which would be essential here. The approaches (yours and related ones) need to be presented in a more comprehensive and self-contained way.