Paper 193 (Research track)

Ontological Databases with Faceted Search

Author(s): Tadeusz Pankowski

Full text: submitted version

Abstract: Ontological databases merge Semantic Web technologies, mainly ontologies, with relational databases. Thanks to this, it is possible to combine the generality and flexibility of ontologies with the effectiveness of relational databases. The challenging issue is then finding methods for formulating queries in such a system by an end-user. The standard language SPARQL essentially is not devoted for end-users. On the other side, the faceted search gains popularity and becomes a de facto standard in many Web and e-commerce applications. Recently, we witness many attempts to adapt the faceted search in semantic-based applications. In this paper we describe solutions in system DAFO (Data Access based on Faceted queries over Ontologies). In a systematic way, we describe creating of an ontological database, the role of ontological schema in this system and how this schema can be viewed in a form of a faceted interface that is used to formulate faceted queries. We discuss semantics of faceted queries in first-order logic, a mapping between ontological schema and a relational schema and its exploitation in faceted query translation and execution.

Keywords: ontologies; ontological databases; semantic databases; schema mapping; faceted search

Decision: reject

Review 1 (by anonymous reviewer)

(RELEVANCE TO ESWC) As I view it, the subject tackled in the paper is relevant to ESWC. Bridging the difficulties of query languages (and vocabulary gaps) for end users, and providing them with easier means of accessing the data are long-term goals which still remain to be solved.
(NOVELTY OF THE PROPOSED SOLUTION) I do not see clear contributions of the paper to the state of the art, neither in the definition of ontological databases (which could be directly subsumed by R2RML) nor in the adaptation of faceted search on top of an ontology (which has already been addressed in similar ways by Kharlamov, Cuenca, Ferré, Hermann, ...). Besides, there are many previous proposals for database access through ontologies.
Finally, the author should position the contribution regarding: 
Rewriting and Executing Faceted Queries over Ontology-Enhanced Databases, T. Pankowski
https://www.sciencedirect.com/science/article/pii/S1877050917315430
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The proposal seems to be correct. However, there is one particular aspect that the author has not addressed: faceted search is inherently coupled to the notion of exploration and browsing of facts/data. In the proposal, the property of preventing the user from getting into a dead end (a query without results) is lost, and this limitation of the proposal (regarding faceted search) is not discussed.
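To illustrate the dead-end property mentioned above, here is a minimal sketch of how a conventional faceted interface typically avoids empty results: it only offers facet values that still occur among the records matching the current selection. The records, field names, and function names are hypothetical and are not taken from the paper or from DAFO.

```python
from collections import Counter

# Hypothetical records; the field names are illustrative only.
PAPERS = [
    {"title": "P1", "venue": "ESWC", "year": 2017},
    {"title": "P2", "venue": "ESWC", "year": 2018},
    {"title": "P3", "venue": "ISWC", "year": 2018},
]

def matches(record, selection):
    """True if the record agrees with every selected facet value."""
    return all(record[facet] == value for facet, value in selection.items())

def available_values(records, selection, facet):
    """Count the values of `facet` among records matching the selection.

    Values that do not appear in the result would lead to a dead end,
    so a data-driven interface would not offer them.
    """
    hits = [r for r in records if matches(r, selection)]
    return Counter(r[facet] for r in hits)

# With venue=ISWC selected, only year 2018 remains selectable.
offered = available_values(PAPERS, {"venue": "ISWC"}, "year")
print(dict(offered))
```

An interface driven only by the ontological schema has no such counts available, which is why a selection can end in an empty answer.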
(EVALUATION OF THE STATE-OF-THE-ART) Regarding query rewriting, OBDA, and ontology mapping to RDBs, there has been a lot of work. To name a few, we could point to the works by Calvanese on query rewriting and OBDA access (without having to deal with distributed access as the author states), or also the works behind the W3C R2RML by Sequeda et al. The author just mentions them without clearly stating where his proposal lies.
Regarding faceted search / semantic faceted search, I miss a clear comparison against works that reconcile faceted search with the Semantic Web and ontologies. In particular, apart from the more recent works by Kharlamov, Cuenca, et al., one would also expect a comparison with the work by Sébastien Ferré and Alice Hermann on this particular subject:
Reconciling faceted search and query languages for the Semantic Web, S. Ferré, A. Hermann
and with related works, projects, and systems such as BrowseRDF, gFacet, SPARKLIS, ...
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The formalization and examples seem to be enough to easily follow and understand the proposal.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) There is no experimental study of the DAFO interface, which in the end is the implementation of the faceted search proposal on the ontological database.
(OVERALL SCORE) The author proposes a formalization of the notion of ontological database and a formalization of a faceted search method to facilitate the access to the information stored in it. 
Strong points: 
* The proposal is formalized, and correctly presented. 
* The faceted search proposed is implemented. 
Main weak points: 
* The contribution is not clear: the paper lacks novelty and does not correctly address the state of the art regarding its two main claims (ontology database access and semantic faceted search).
* Lack of evaluation of the proposal in terms of usability. When proposing a search method, its usefulness must be assessed (e.g., via usability studies, or evaluating the improvements in terms of interaction, precision, etc). 
Questions/Comments to the Authors: 
* Why is the expressivity of the ontologies so restricted? We have the OWL 2 QL Profile precisely to allow the extensional data to be stored in an RDBMS. In the example, Figure 1, given the paper's definition of an ontology, it seems that we could not use a defined concept (an intensional predicate) as the domain or range of a property (e.g., we could not state that writtenBy has range Author). Besides, specifying the expressivity in terms of DL expressivity would improve readability for this community.
* One of the claimed contributions is the cloning mechanism, but it is not explained.
* The faceted interface as it is allows for the selection of different branches in the tree, which then are translated into faceted searches. However, different branches can be incompatible (in terms of existing underlying data). From the point of view of exploiting the ontology (the schema/intensional knowledge) this might be interesting, but when accessing the data themselves (or during the navigation), this can be confusing for a user that is used to regular faceted search. In my opinion, this difference must be discussed further, and evaluated with final users. 
* What is the incremental contribution with respect to:
Rewriting and Executing Faceted Queries over Ontology-Enhanced Databases, T. Pankowski
https://www.sciencedirect.com/science/article/pii/S1877050917315430
Remarks after the rebuttal: 
The novelty of the paper is still compromised by the previous paper, and the author has not given a precise answer on this subject. Besides, I do not agree with the answer about the OWL QL profile. Taken directly from the OWL QL description (OWL 2 Profiles W3C document):
'The OWL 2 QL profile is designed so that sound and complete query answering is in LOGSPACE (more precisely, in AC0) with respect to the size of the data (assertions), while providing many of the main features necessary to express conceptual models such as UML class diagrams and ER diagrams. In particular, this profile contains the intersection of RDFS and OWL 2 DL. It is designed so that data (assertions) that is stored in a standard relational database system can be queried through an ontology via a simple rewriting mechanism, i.e., by rewriting the query into an SQL query that is then answered by the RDBMS system, without any changes to the data.'
This, along with the rest of the previous comments, leads me not to change the evaluation already given.


Review 2 (by anonymous reviewer)

(RELEVANCE TO ESWC) The topic is relevant for ESWC. The paper describes a way to query an ontological database using a facetted search over an ontological schema.
(NOVELTY OF THE PROPOSED SOLUTION) The novelty of the approach has not been demonstrated by the author. No comparison to previous work has been presented (cf. Overall score).
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The author presents only an informal proof of the main theorem which is in my opinion not sufficient for proving the correctness and completeness of the proposed solution.
(EVALUATION OF THE STATE-OF-THE-ART) Related work is presented very briefly with no comparison to the proposed approach.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The main properties are presented in an exhaustive manner.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) Not applicable due to lack of experimental study.
(OVERALL SCORE) Summary of the Paper
The paper describes a method for querying ontological databases using a facetted interface against an ontological schema. The facetted queries semantics are expressed in first order logic. The author proposes an extension of the expressiveness of the facetted queries in order to cope with the limitations related to unary and binary predicates interpretation.
Strong Points (SPs)
The topic of the paper is of relevance for the semantic web and database communities.
The paper introduces the main concepts in an exhaustive manner.
The proposed solution although not evaluated is original and appropriate for the considered problem.
Weak Points (WPs)
The paper describes related work in a very brief manner and presents no comparison to related work. The author does not present the contribution compared to previous work.
The novelty of the proposed method is not highlighted especially in contrast to the work describing the system that is mentioned as being the solution implemented based on the theoretical contribution in this paper.
Lack of theoretical validation of the correctness and completeness of the proposed approach. The proof of theorem 1 is discussed informally, the author could consider linking to an external source or an appendix.
The description logics expressivity and the parallel to conjunctive queries are not addressed.
The paper presents no experimental evaluation.
Remarks after rebuttal process
The novelty of the paper compared to the author's previous work is still unclear, especially given the missing reference:
Tadeusz Pankowski, Rewriting and Executing Faceted Queries over Ontology-Enhanced Databases, Procedia Computer Science, Volume 112, 2017 (http://www.sciencedirect.com/science/article/pii/S1877050917315430)
This and the lack of theoretical and practical validation of the proposed approach make me keep my initial decision about the paper.


Review 3 (by anonymous reviewer)

(RELEVANCE TO ESWC) The paper covers ontological databases, a kind of ontology-based system with mappings to RDBs. SPARQL query construction is done in an already existing OBDA-platform (DAFO), and query rewriting is a central part of query execution. The theoretical framework is described using formal logic, and together with the implementation it gives a good mix of theory and real-world applications.
This work falls within at least three of the topics listed under the chosen subtrack:
- Query processing of Semantic Data
- Semantic Searching and Browsing
- Semantic Data Integration
(NOVELTY OF THE PROPOSED SOLUTION) Ontology-based systems, mappings to RDBs, query rewriting and faceted search are all well studied techniques and areas, and the paper is not novel in that regard. The DAFO system is also mentioned in [14] together with a description of faceted interfaces, FOFQs and the query rewriting.
However, the author introduces a formal definition of "ontological databases". This seems to be an incremental step from his previous work [14]. The cloning operation added to the DAFO system is also not found in previous work.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Section 3 about ontological databases is quite technical and the description and intuition around the meta-schema mapping is not fully clear to me. That said, everything I managed to understand seemed to be correct.
Section 4 about faceted views and queries describes an approach to making faceted queries over the ontology. The query construction, the rewriting and the new cloning operation all seem reasonable to me.
(EVALUATION OF THE STATE-OF-THE-ART) The author has included a lot of work done in the semantic web community, and even recent papers have been included. He refers to already existing OBDA approaches and systems, theoretical work about extended databases, query rewriting and faceted search systems like SemFacet and DAFO (which is his own work).
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The approach is described theoretically and demonstrated in the DAFO system. The examples also give good support. However, I wish the author could have added more supporting text to give a better intuition of the approach, especially for the parts related to the meta-schema and schema mappings.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) - Reproducibility: The DAFO system itself does not seem to be available online, but the underlying ideas it uses can be reused by others. There are no experiments related to the paper except for the implementations done in DAFO, so there is really not much to reproduce. The definitions and theory about ontological databases can of course be tested and questioned by anyone with the necessary technical background.
- Generality: Ontological databases are not restricted to any specific domain, but they are limited to conjunctive ontologies and the RDB constraints described in the paper. The faceted queries described are also quite expressive, at least after including the extra parts provided by the cloning operation.
(OVERALL SCORE) The paper describes ontological databases, which are RDBs with mappings to an ontology-based global schema. Ontological databases make a clear distinction between extensional and intensional predicates in the ontology, and only extensional predicates are actually mapped to the relational schema.
Query construction over ontological databases is based on both extensional and intensional predicates of the ontology, so a rewriting step is needed in order to remove intensional predicates from the query.
The DAFO system includes a faceted interface, which is based on a sequence of keywords. The faceted interface is used to generate a faceted query, which is rewritten into a query only using extensional predicates. After that, mappings are used to generate a SQL query over the RDBs.
The cloning operation allows the user to make queries with conjunctions of unary predicates, and disjunctions of binary predicates.
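The pipeline summarized above (replace intensional predicates by extensional ones, then translate to SQL) can be sketched as follows. The rule is the one quoted in the textual corrections for page 5 of the paper; the table layout, column names, and helper functions are hypothetical, and the sketch assumes the query uses the same variable names as the rule, which a real rewriter would not require.

```python
import sqlite3

# Hypothetical extensional tables; the layout is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person(id TEXT PRIMARY KEY);
    CREATE TABLE Paper(id TEXT PRIMARY KEY);
    CREATE TABLE authorOf(person TEXT, paper TEXT);
    INSERT INTO Person VALUES ('ann'), ('bob');
    INSERT INTO Paper VALUES ('p1');
    INSERT INTO authorOf VALUES ('ann', 'p1');
""")
COLUMNS = {"Person": ["id"], "Paper": ["id"], "authorOf": ["person", "paper"]}

# Rule: Person(x) & authorOf(x,y) & Paper(y) -> Author(x).
# Author is intensional, so it is replaced by the rule body.
REWRITE = {"Author": ["Person(x)", "authorOf(x,y)", "Paper(y)"]}

def rewrite(atoms):
    """Replace each intensional atom by the extensional body of its rule."""
    out = []
    for atom in atoms:
        name = atom.split("(")[0]
        out.extend(REWRITE.get(name, [atom]))
    return out

def to_sql(atoms, answer_var="x"):
    """Translate a conjunction of extensional atoms into a SQL join."""
    first_seen = {}  # variable -> first column reference where it occurs
    tables, joins = [], []
    for i, atom in enumerate(atoms):
        name, args = atom.rstrip(")").split("(")
        tables.append(f"{name} AS t{i}")
        for col, var in zip(COLUMNS[name], args.split(",")):
            ref = f"t{i}.{col}"
            if var in first_seen:
                joins.append(f"{first_seen[var]} = {ref}")
            else:
                first_seen[var] = ref
    sql = f"SELECT DISTINCT {first_seen[answer_var]} FROM {', '.join(tables)}"
    if joins:
        sql += " WHERE " + " AND ".join(joins)
    return sql

sql = to_sql(rewrite(["Author(x)"]))
print(sql)
print(conn.execute(sql).fetchall())
```

Only the extensional tables are ever queried; the intensional predicate Author exists solely at the ontology level.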
* Strong Points (SPs)
- It refers to well known researchers and even recent papers from the semantic web community.
- It is theoretical, but it also includes good, easy to understand examples.
- It contains informative screenshots of DAFO, the system which the approach is implemented in. The fact that it is implemented in a real world system is very good.
- The overall structure and division of the textual content is also very good.
* Weak Points (WPs)
- The part about meta-mappings and schema mappings was not very clear. I wish the author had explained it in more detail and shared more of his intuition.
- The reason to include query keywords is not clear. Is it simply because they are a part of the already existing DAFO system?
- Maybe the proof of Theorem 1 could have been included in an appendix. It is very informal, and since it was listed as a proof of the theorem, I expected more.
- The English can be improved. "a" and "the" are often missing, which makes parts of the text harder to understand.
* Questions to the Authors (QAs) 
1. What is the main difference between ontological databases and OBDA-systems using R2RML?
Authors’ answer: "OBDA and R2RML are focused on creating an ontology (as a global schema) from an existing relational database(s). In ontological databases, the starting point is an ontology, and the effort is directed to efficient query answering. Then, such issues as first-order rewritability and mechanisms for query formulation are crucial (Gottlob et al. [11]). "
-> The focus in OBDA is not to create an ontology, but to answer queries.  R2RML is used to provide a bridge to relational databases, so the database content is (virtually) transformed to an ABox.  Reasoning through query rewriting is possible as long as the ontology is within OWL QL.
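As a rough illustration of what "(virtually) transformed to an ABox" means, the following sketch exposes relational rows as assertions on demand, without copying the data. The mapping dictionary is a hypothetical stand-in and is not R2RML syntax; table, column, and property names are invented for the example.

```python
# Hypothetical relational content and mapping; this is not R2RML syntax.
ROWS = {"Person": [{"Id": 1, "Name": "Ann"}]}
MAPPING = {
    "Person": {"class": "Person", "key": "Id", "properties": {"Name": "name"}},
}

def virtual_abox(rows, mapping):
    """Lazily yield (subject, predicate, object) assertions for each row."""
    for table, records in rows.items():
        m = mapping[table]
        for rec in records:
            subject = f"{table.lower()}/{rec[m['key']]}"
            # One class assertion per row, one property assertion per mapped column.
            yield (subject, "rdf:type", m["class"])
            for column, prop in m["properties"].items():
                yield (subject, prop, rec[column])

for assertion in virtual_abox(ROWS, MAPPING):
    print(assertion)
```

Because the assertions are generated lazily, the database remains the single store of the data, and queries over the ontology can be answered by rewriting rather than by materialization.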
Suggested textual improvements:
Explanation: [remove text inside square brackets] (add text inside parentheses).
Page 1:
The considerations are illustrated by solutions applied in (the) DAFO system.
methods for formulating queries against (the) ontology in such a system
The problem of rewriting queries in such (a) scenario was studied in
Page 2:
One method for query(ing) ontologies is faceted search.
Faceted queries are usually executed [throw](through) transformation
We [proof](prove) that the cloning operation proposed
Page 3:
Further on, we assume that (the) considered ontologies are
Page 4:
The proposed restrictions are those which were verified in implementation of (the) DAFO system
Categories of rules in (the) ontological database DAFO
Page 5: 
[Paper](Person)(x) & authorOf(x,y) & Paper(y) -> Author(x)
Page 6: 
(The) ER schema in Figure 1 can be transformed into (a) relational schema
Elements of (the) ontological schema are mapped to elements of (the) relational schema by (a) meta-schema level mapping.
Page 7:
Any dependency has [a](the) form of (an) implication
tgds are written in (the) target language
Additionally, we assume that in (the) target language
if there is no [a] tuple r in R
Page 8:
The result of executi[on](ng) m against
Note, that a consistent ontology can be mapped into (an) inconsistent relational
(An) [O](o)ntological database is a knowledge base
A structural graph of a fragment of (the) terminological part
A binary predicate with [domain](range) String is called a data property
In such (a) faceted view:
Page 9:
A sample hypertree faceted view of (the) ontology BibOn
Page 10:
In t(h)is way, a hypertree that has n nodes
Finally, a new root node labeled by root is created and all tree(s) are connected with this root node.
Page 11:
a[l](r)e selected.
Page 14:
The set of rules is divided into integrity constraint an(d) rewriting rules
Especially, the crucial (part) is the utilization of RDBS that implements advanced optimization


Review 4 (by anonymous reviewer)

(RELEVANCE TO ESWC) The paper is about ontological enhanced databases (and faceted queries), which fits the scope of ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) The paper introduces an ontology language consisting of rules (FOL implications that meet certain syntactical constraints) and facts. For a class of ontologies it is shown some of the terminological part can be considered as ontological schema of an ontological database. The extensional part of the ontology is instantiated as relational database.  It is further shown how such an ontological database may support faceted queries.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) I don't know how to evaluate correctness and completeness in this context, since the paper does not present a logical system. Thus, the terms correctness and completeness do not apply.
(EVALUATION OF THE STATE-OF-THE-ART) I don’t know the literature and, thus, cannot judge whether the paper is state of the art. I therefore used my overall mark.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The approach was demonstrated with the help of examples. Except for one theorem, there was no discussion of the properties of the approach. The lack of evaluation is a weakness of the paper.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) There was no experimental study.
(OVERALL SCORE) The paper introduces an ontology language consisting of rules (FOL implications that meet certain syntactical constraints) and facts. For a class of ontologies it is shown some of the terminological part can be considered as ontological schema of an ontological database. The extensional part of the ontology is instantiated as relational database.  It is further shown how such an ontological database may support faceted queries. 
The paper is well-structured and mostly easy to follow. The overall impression is marred by significant weaknesses of the presentation. There are plenty of grammatical errors, which sometimes lead to incomprehensible sentences. Further, some of the formal definitions seem to contain errors.
For example: 
- in section 2.1 bullet point "2." the variable tuple x' is not bound by a quantifier.
- in section 2.1 bullet point "2.", the text refers to psi prime, but no psi prime is mentioned earlier in the text
- the definition of answer to a PEQ makes no sense. The text claims that an answer is a set of constant tuples (instead of a set of valuations, which one would expect). The formal definition does not match that description.
Definition 4 of schema mappings is very confusing.  On first glance schema mappings look like a set of FOL implications. According to the text a schema mapping consists of "implication[s], where the left-hand side is written in the source language [...] and the right-hand side [...] in [the] target language". However, this is not the case, since the right-hand side contains existential quantifiers that are not part of the target language. These quantifiers seem to be interpreted procedurally. It seems that the author intends to specify an algorithm for a mapping between an ontology language and a relational schema. It would be better to present this algorithm differently. 
I have a general question concerning the approach presented in the paper. Ontologies first became popular in the biomedical world. The reason was that ontologies provided a solution to a pressing data integration problem at the time: research groups across the world maintained databases containing information on genes and gene products, and each of them stored its data in relational databases that were designed to fit its unique needs but prevented easy data integration. Ontologies solved that problem because they allowed the various research groups to exchange data without having to change their databases. This was achieved because the knowledge representation in the ontologies was decoupled from the way the information was stored. This decoupling is indeed a prerequisite for the reusability of ontologies. It seems to me that the approach proposed by the author undermines the decoupling of data and knowledge representation. In particular, the required distinction between "extensional predicates" and "intensional predicates" has no ontological basis but is only motivated by the way the predicates are treated in the database and what facts are stored. Since different people inevitably will make different choices on what predicates are "extensional" or "intensional", this kind of distinction seems to provide an obstacle to interoperability. Doesn’t that undermine the whole point of using an ontology?
Minor:
- In (1) chase_{R_c} is not defined, since R_c is not defined in Example 1. 
- Example 3: it probably should be: att(affiliation)= {Person[Id], ...}


Metareview by Olaf Hartig

This paper presents a (formal) approach to query ontological databases using faceted search. The reviewers agree that the presented work is relevant and that the paper makes it easy to follow and to understand the approach. On the other hand, the reviewers point out a number of issues with the paper. The main concern raised by several reviewers is that the novelty and the relationship of the presented work to state of the art is unclear. In particular, not only is the comparison to related work insufficient, there also is an earlier paper by the same author that has significant overlap with this submission (which does not reference it, and the rebuttal did not help to shed light on the issue either). Other weaknesses of the paper are the lack of an evaluation of the proposed approach in terms of usability and the insufficient theoretical validation of the approach. Due to these issues, the paper cannot be accepted for publication in the conference.

