User-Centric Ontology Population
Author(s): Kenneth Clarkson, Anna Lisa Gentile, Daniel Gruhl, Petar Ristoski, Joseph Terdiman, Steve Welch
Full text: submitted version
Abstract: Ontologies are a basic tool to formalize and share knowledge. However, very often the conceptualization of a specific domain depends on the particular user’s needs. We propose a methodology to perform user-centric ontology population that efficiently includes human-in-the-loop at each step. Given the existence of suitable target ontologies, our methodology supports the alignment of concepts in the user’s conceptualization with concepts of the target ontologies, using a novel hierarchical classification approach. Our methodology also helps the user to build, alter and grow their initial conceptualization, exploiting both the target ontologies and new facts extracted from unstructured data. We evaluate our approach on a real-world example in the healthcare domain, in which adverse phrases for drug reactions, as extracted from user blogs, are aligned with MedDRA concepts. The evaluation shows that our approach has high efficacy in assisting the user to both build the initial ontology ([email protected] up to 99.5%) and to maintain it ([email protected] up to 99.1%).
Keywords: human-in-the-loop; neural network; ontology alignment
Review 1 (by Irene Celino)
(RELEVANCE TO ESWC) The paper addresses an important issue in ontology population and maintenance.
(NOVELTY OF THE PROPOSED SOLUTION) To the best of my knowledge, the solution the authors propose is novel wrt pre-existing approaches.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The proposed solution is clearly and fully explained and then applied.
(EVALUATION OF THE STATE-OF-THE-ART) As far as I know (but I'm not an expert) the related work is correctly reported and the proposed work is framed in that context.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The evaluation is complete wrt the stated objectives and demonstrates the capabilities of the proposed solution.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) In order to make the experiments reproducible, the manually annotated dataset should be made available (and it seems it would be a very interesting artifact for the community!). Some details of the experimental protocol are not fully clear, as detailed below.
(OVERALL SCORE) I liked the paper and I recommend it for acceptance; still, I think that it could be improved on a number of points. First of all, while I appreciate the contribution, it seems to me that a further step is missing to fully demonstrate the stated goal "to assist the user in achieving their goals more efficiently and effectively". Referring to the alignment step, for example, the authors' approach can provide the users with a list of 10 target concepts, being confident that the correct one is in the top-10; still, it remains to be demonstrated that the users will then pick the "right" one in the top-10. I'd therefore recommend a follow-up study in which a set of users are actually given the automatic alignment output and their precision in the final refinement step is checked. As a consequence, I'd recommend that the authors soften their claims wrt the users' support a bit.
Secondly, I think that, while the paper is generally clear and easy to read, in some points the language is a bit confusing. In particular, I think the authors use the words "concept", "conceptualization" and "instance" in an ambiguous way. Table 1, for example, has a column of "user's conceptualization" but, if I read the text correctly, those are the "instances" to be classified (the text refers to "surface forms"). I'd recommend using the word "concept" only for "ontology classes" to avoid misunderstanding.
Furthermore, the experiment section, especially in the second part §4.2, lacks some details that I think are worth adding. In "adding new concepts", the authors discuss 500 "incoherent" instances and 500 "coherent" instances (wrt the already aligned ontology). Then the authors report precision/recall/f-score only for the first 500 (without even commenting on the results: are they good? can they use any baseline or comparable competing approach? are the computed entropy values "interesting"?), while they don't report on the other 500 (what was the precision/recall/f-score of *not* proposing to add a new concept? why do the authors introduce the 500 "coherent" instances if they then don't use them?). In "reassigning instances", it is completely unclear what data was used, where it came from, what the expected result was and whether only a single user (even if very expert) was employed to evaluate the results. This is the most unclear sub-section of the entire paper.
Finally, as stated above, I'd recommend that the authors make their data available to the community, since it seems that the manually annotated set used to compare the different approaches is a very valuable resource; sharing it would also ensure the full reproducibility of the illustrated experiments.
*** after authors' rebuttal *** I thank the authors for their clarifications. I confirm my "accept" overall score.
Review 2 (by Hideaki Takeda)
(RELEVANCE TO ESWC) The paper identifies a problem that arises when applying ontologies in a real situation and shows a solution for it. It is an important issue for ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) The novelty is to propose an intermediate way between top-down and bottom-up when applying machine learning. It is not novel as a machine learning technique, but it is valuable to apply machine learning to a real problem.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The solution itself is well described. However, the circumstances under which the method applies are not clear, for example the similarity between the human's and the ontology's conceptualization, or the similarity between instances in the extracted data and the ontology.
(EVALUATION OF THE STATE-OF-THE-ART) The identification of the problem is unique, and combining machine learning with human-in-the-loop is an original solution.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The single use case is weak for validating the proposed method: although it is good that a detailed comparative evaluation is given, the validity of the method, in particular its generality, is not shown in the experiment. As mentioned above, the result relies heavily on the similarity between the users' conceptualization and the ontology's. It is important to show whether the method still works, or how the result varies, when the difference between the two conceptualizations is larger. For example, it would be more persuasive if there were experiments with users of different knowledge levels.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The data used in the experiment is not openly available, so reproducibility cannot be checked. Neither is the system itself.
(OVERALL SCORE) The paper proposes a new method to use existing ontologies on a given dataset with user interaction. The user defines groups of instances and the level of the specific ontology. 
Then the system suggests appropriate concepts in the ontology to match the given groups of instances. The novelty is to propose an intermediate way between top-down and bottom-up when applying machine learning. The result shows that it works well; for example, [email protected] at level 4 is over 66%.
The strong points are:
(1) It attacks a practical issue and solves it: even if users want to use ontologies for their data, large ontologies are difficult to understand. The proposed method does not require users to read and understand the whole ontology, but simply to understand the given concepts to match their data.
(2) It proposes a practical solution with human-in-the-loop: the authors do not pursue a computationally simple and beautiful solution, but rather look for the best combination of carefully examined machine learning techniques and human effort. And it seems to work well.
(3) The applied case is a real problem at real scale, neither toy data nor computer-generated data.
The weak points are:
(1) The limitations of the proposed method are not clear. There seem to be many implicit assumptions needed for the proposed method to work successfully. For example, the users must be knowledgeable about the levels in the ontologies, otherwise they cannot specify the appropriate level. The conceptualizations of the users and of the ontologies must be similar. The system assumes that such ontologies exist and are findable. The users must also be knowledgeable enough at least to judge whether instances are suitable for inclusion in the given concepts. The paper assumes some specific situation, but that situation is not clearly specified, so it is difficult to judge how generally applicable the proposed method is. 
(2) The single use case is weak for validating the proposed method: although it is good that a detailed comparative evaluation is given, the validity of the method, in particular its generality, is not shown in the experiment. As mentioned above, the result relies heavily on the similarity between the users' conceptualization and the ontology's. It is important to show whether the method still works, or how the result varies, when the difference between the two conceptualizations is larger. For example, it would be more persuasive if there were experiments with users of different knowledge levels.
(3) Ontology maintenance is not so clearly described in the paper. It is also a practical issue, so it is good to show a solution. It seems that this process is also done with the human-in-the-loop, but the role of the human is not clear.
Question: In ontology maintenance, the authors show a “reassigning instances” process. It seems that it would also be applicable during the initial matching of user-defined groups to ontology concepts. Is there any reason why it is not applied in that way? Or do you have any results from doing so? It seems that the process could serve as a hint for evaluating the difference between the human's and the ontology’s conceptualization.
Review 3 (by anonymous reviewer)
(RELEVANCE TO ESWC) The paper is concerned with ontology construction, a central topic for ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) The approach emphasizes a semi-automatic method that includes the human in the loop. The assumption is that previous approaches to ontology construction (population) did not, which is incorrect. Most previous approaches assume a semi-automated process in which human experts use automatically derived output in the context of a bigger workflow.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The approach is worked out well and appropriate experiments have been implemented.
(EVALUATION OF THE STATE-OF-THE-ART) Experiments have been implemented that include evaluation on three different aspects of ontology construction (align concepts, identify new instances, identify the requirement for a new concept).
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The approach is presented in a language-independent way, i.e. the authors claim that no 'NLP' is needed and that it can be used across different languages and domains. The authors probably refer here specifically to 'NLP' as the use of linguistic features, as they do use a word embedding approach, which is also an NLP-based approach. However, the more important issue here is that the authors assume that a word embedding approach is language independent, which it is not, as morphology (word structure) plays a role as well as syntax (word sequence). The approach cannot be easily transferred from English to other languages that have a more complex morphology and/or syntax.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The approach is reproducible.
(OVERALL SCORE) The paper presents an approach to ontology construction that uses an automatic method for identifying relevant concepts from existing ontologies as well as a human in the loop to guide this process. 
The paper is well-written, the approach is worked out well and reproducible, and appropriate experiments have been implemented. The authors claim their approach to be language independent, which it is not, as morphology (word structure) plays a role as well as syntax (word sequence). The approach cannot be easily transferred from English to other languages that have a more complex morphology and/or syntax.
I acknowledge the author response. The authors further claim their approach to be language independent because word embeddings can be generated for many languages. Although this is true, it does not imply that the approach is language-independent (this would mean the approach can work for any language). It also does not acknowledge that, even though an approach can be applied to many languages, this does not mean it performs equally well across them. This needs to be shown.
Review 4 (by Steffen Lohmann)
(RELEVANCE TO ESWC) The paper proposes a methodology for populating ontologies by involving users in the process of building, connecting and maintaining these ontologies. The methodology applies a new approach for mapping and aligning the users' conceptualization of a domain with the available knowledge/ontologies. It also assists the user in maintaining the initial ontology (e.g., creating concepts, splitting or merging concepts, ...).
(NOVELTY OF THE PROPOSED SOLUTION) The main contribution is the alignment process, which is part of the proposed methodology. The alignment process adopts a novel hierarchical classification to map between the users’ conceptualization and the target ontology. The authors used three machine learning approaches for identifying the best matching concepts in a target ontology.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The solution addresses the problem and challenges that were defined in the paper. Each step of the proposed methodology was explained clearly. The alignment process was carried out with more than one machine learning approach to ensure that the best match was selected. The approach used an uncertainty measure when it was necessary to predict the classifiers. The methodology was evaluated on a real case study from the medical domain. The performance results were reported, compared against three baseline methods, and evaluated using the metric [email protected]
(EVALUATION OF THE STATE-OF-THE-ART) The state of the art and related work were categorized and analyzed well.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The proposed methodology was explained in detail, the methodology was elaborated through an experiment (a real case study from the medical domain), and the evaluation and results were thoroughly discussed. 
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The experiment was done in the medical domain, specifically addressing the problem of Adverse Drug Reactions; a domain expert helped with the data and knowledge. The methodology is flexible and can be applied to other domains with the help of users/experts of the new domain.
(OVERALL SCORE) The paper addresses the problem of mapping and reusing ontologies by involving users in this process. The main contribution of this paper is the hierarchical classification, which is implemented with a machine learning approach to select the best matches in the target ontology. The methodology also continues with maintaining the ontology by using the approach for adding new concepts and instances. It is not convincingly shown that the approach does not expect any Semantic Web knowledge from users, as, for example, the user is supposed to select a level at which to perform the alignment. I believe this needs some level of expertise not immediately available to all users.
Metareview by Christoph Lange
The reviewers agree that this paper addresses a practical problem of high relevance, with a clear methodology. However, the approach has only been evaluated in a single, too specific setting, which does not prove the generality of the method. On a related note, I (the senior meta-reviewer) second Reviewer 3's request to point out more precisely what is language-independent and what is not. Some limitations and implicit assumptions are not clear. A certain lack of detail, clarity and precise conceptualisation was discussed, and acknowledged in the authors' response, i.e., we trust you to implement this.