Paper 130 (In-Use track)

Platypus, A Multilingual Question Answering Platform for Wikidata

Author(s): Thomas Pellissier Tanon, Eddy Caron, Marcos Dias de Assuncao, Fabian Suchanek

Full text: submitted version

Abstract: In this paper we present Platypus, a natural language question answering system.
Our objective is to provide the research community with a production-ready multilingual question answering platform that targets Wikidata, the largest general-purpose knowledge base on the Semantic Web.
Our platform can answer complex queries in several languages over knowledge bases using hybrid grammatical and template-based techniques.

Keywords: question answering; Wikidata; knowledge base; universal dependencies

Decision: reject

Review 1 (by anonymous reviewer)

The paper presents Platypus, a question answering platform for (not only) Wikidata.
The platform builds on a generic framework with a three-step process for query answering, where 1) natural language queries are translated into logical representations using a) transformation rules based on sentence grammars and b) query templates and slot filling, 2) the logical representations / interpretations are ranked, and 3) the logical representations are translated to SPARQL and evaluated over the knowledge base.
All steps are well described, with as much technical detail as is possible in this type of publication.
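For readers unfamiliar with this kind of architecture, the three-step process summarized above can be illustrated with a minimal sketch. It is not the Platypus implementation: the two toy rules, the `Interpretation` type, the scores, and every function name below are hypothetical, and the sketch stops at generating a SPARQL string instead of evaluating it over the knowledge base. It also assumes the target endpoint predefines the `rdfs:` and `wikibase:` prefixes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interpretation:
    """Toy logical representation: '<prop> of <entity>' with a confidence score."""
    entity: str
    prop: str
    score: float
    source: str  # "grammar" or "template"


def grammar_rules(question: str) -> List[Interpretation]:
    """Step 1a (toy stand-in): match 'Who is the <property> of <entity>?'."""
    words = question.strip().rstrip("?").split()
    lowered = [w.lower() for w in words]
    if lowered[:3] == ["who", "is", "the"] and "of" in lowered[3:]:
        i = lowered.index("of", 3)
        prop, entity = " ".join(words[3:i]), " ".join(words[i + 1:])
        if prop and entity:
            return [Interpretation(entity, prop, 0.9, "grammar")]
    return []


def template_rules(question: str) -> List[Interpretation]:
    """Step 1b (toy stand-in): fill the slot of the template 'Where is <entity>?'."""
    text = question.strip().rstrip("?").strip()
    prefix = "where is "
    if text.lower().startswith(prefix) and len(text) > len(prefix):
        return [Interpretation(text[len(prefix):], "location", 0.6, "template")]
    return []


def rank(candidates: List[Interpretation]) -> List[Interpretation]:
    """Step 2: order interpretations by score (the scores here are arbitrary)."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)


def to_sparql(interp: Interpretation, lang: str = "en") -> str:
    """Step 3 (first half): build a label-based SPARQL query string."""
    def literal(text: str) -> str:
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"@{lang}'

    return (
        "SELECT ?answer WHERE {\n"
        f"  ?entity rdfs:label {literal(interp.entity)} .\n"
        f"  ?property rdfs:label {literal(interp.prop)} ;\n"
        "            wikibase:directClaim ?p .\n"
        "  ?entity ?p ?answer .\n"
        "}"
    )


def question_to_sparql(question: str, lang: str = "en") -> Optional[str]:
    """Run the whole sketch; return None when no rule or template matches."""
    ranked = rank(grammar_rules(question) + template_rules(question))
    return to_sparql(ranked[0], lang) if ranked else None


if __name__ == "__main__":
    print(question_to_sparql("Who is the author of Les Misérables?"))
```

In this sketch, a question that matches neither the rule nor the template yields `None`. A real system would also need entity disambiguation, since matching on labels alone is ambiguous.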
The work is novel and - to the best of my knowledge - constitutes the only mature and publicly accessible system that works over Wikidata.
Related work is sufficiently discussed. The evaluation is sound; however, it is a pity that the authors did not consider the benchmark specifically targeted at the addressed problem, namely Task 4 of the QALD challenge:
Further, the adequacy of the work for the In-Use track is questionable. The paper is presented like a typical research paper (although for the research track the degree of novelty might be too limited). There is little focus on the application perspective or on the description of specific use cases. The authors state they want to “provide the research community with a question answering platform …” - this is not the typical kind of application for the In-Use / industry track. Also, for the research community one would expect an open source publication of the platform.
Despite the question of which track is better suited, I believe the paper should be presented at ESWC.

Review 2 (by Daniel Garijo)

----------------AFTER REBUTTAL----------------
I thank the authors for their answers. After reading the response, I have decided to keep my score:
* The authors don't address the availability of the code of the system, making the approach not reusable. Also, none of the parameters used for training are available.
* I have tried the tool again for Spanish, and I confirm that it does not work properly. Some of the queries proposed by the authors are not syntactically correct in Spanish. The first one works (although people usually say "Donde esta el Big Ben"). The second one ("¿Quién es el población del país de Big Ben?") doesn't make sense in Spanish. The right question is: "¿Cual es la población del país donde esta el Big Ben?", for which I get 42. I also tried a simpler one, "¿Quien es Barack Obama?" (Who is Barack Obama?), and the answer was 42 again. It works for English.
* The authors have not addressed the impact of the system for this paper.
* The evaluation does not show the advantages of the system versus the others in a way that we can measure. If it's not in terms of performance, maybe the authors should show that the set-up of the system is faster? How can "It does not need to have the full knowledge base available to train the entity lookup system" be shown in the paper in a comparable manner?
----------------ORIGINAL REVIEW----------------
This paper presents the Platypus question answering system. One of the novelties of Platypus is that it can answer complex queries in multiple languages, using Wikidata as main source.
- The paper is well written, easy to follow and relevant for ESWC. It also describes a real problem, as conversational agents currently present much room for improvement. However, according to the criteria for the In Use track, I do not think the paper is ready to be accepted in the conference. I further explain my reasons below.
- Evaluation: It shows that Platypus ranks fourth out of the five compared systems, making the system not very competitive. Why does Platypus perform so poorly? The evaluation does not discuss the cons of the presented approach.
- The code does not seem to be available, and neither are the hyperparameters used for training. The authors claim that Platypus is extensible, but how can someone extend it if no source code is available? Why aren't other approaches extensible as well?
- The impact of the tool on the community has not been included or analyzed.
- The novelty of the approach, besides using another knowledge base, is not clear. 
- I have tried the tool in English and Spanish, but only got (some) results for English.

Review 3 (by Vanessa Lopez)

The paper presents a multilingual QA system over Linked Data. QA systems over DBpedia and Wikidata have been studied for a while; see the survey [1] on QA systems evaluated both over DBpedia (QALD) and Freebase (now Wikidata).
The contribution of the paper is that it tackles multilinguality. Only a few systems do so, and the authors are missing some of them in their related work [2][3].
The proposed approach combines rules based on sentence grammars with query templates and training data for slot filling. The approach is interesting, but the performance does not improve on that of other state-of-the-art systems.
My main question is why the evaluation is limited to Simple Questions. There are other evaluation datasets on top of Freebase / Wikidata with somewhat more complex and interesting NL questions (which the authors should be able to tackle with the logical representations presented here). Also, why do the authors focus ONLY on Wikidata? What about other datasets such as DBpedia (evaluated using the QALD benchmarks)?
Also, this paper would be a better fit for the research track than for the In-Use track, as there is no clearly defined use case with users.
[1] Core Techniques of Question Answering Systems over Knowledge Bases: a Survey. D. Diefenbach, V. Lopez, K. Singh, P. Maret. Knowledge and Information Systems (2017).
[2] WDAqua-core0: A Question Answering Component for the Research Community. D. Diefenbach, K. Singh, P. Maret. ESWC, 7th Open Challenge on Question Answering over Linked Data (QALD-7) (2017).
[3] AMUSE: Multilingual Semantic Parsing for Question Answering over Linked Data. S. Hakimov, S. Jebbara, P. Cimiano. In: Proceedings of the 16th International Semantic Web Conference (ISWC 2017) (2017).

Review 4 (by anonymous reviewer)

This paper details a QA system that is inherently multilingual and operates on Wikidata. The proposed method is available online. It is a well-written and easy-to-follow paper, and I have very little to criticize. My only concern is that its performance in comparison with other systems is OK but not mind-blowing. But then again, these other systems are not as easily available online.

Review 5 (by Anna Tordai)

This is a metareview for the paper that summarizes the opinions of the individual reviewers.
This well-written paper addresses the under-explored topic of multilingual question answering. The approach is novel and the demo is publicly available. The reviewers point out that it is not clear what the benefit of the presented system is over other systems, given that it does not outperform them in the evaluation. The reviewers also question why the authors have not tested their system on benchmarks such as QALD Task 4. The main issue with this paper is that the system has not been deployed and that there are no actual users outside the context of an academic evaluation, which makes it less suitable for the In-Use track of ESWC.
Laura Hollink & Anna Tordai
