How my actuator can understand what your sensor says - Revisiting the Web’s Architectural and Linked Data Principles for Legacy Services and the Web of Things
Author(s): Maxime Lefrançois
Full text: submitted version
Abstract: RDF aims at being the universal abstract data model for structured data on the Web. However, the vast majority of web services consume and expose non-RDF data, and it is unlikely that all these services be converted to RDF one day. This is especially true for sensors and other devices in the Web of Things, as most RDF formats are verbose while constrained devices prefer to consume and expose data in concise formats. In this paper, we propose an approach to make these services and things reach semantic interoperability, while letting them the freedom to use their preferred formats. Our approach is rooted in the Web’s architectural principles and the linked data principles, and relies on the definition of RDF presentations, which describe the link between RDF graphs and their representations. We introduce the RDF Presentation ontology (RDFP) that can be used to model inputs and outputs of procedures of the W3C and OGC joint SOSA/SSN standard ontology, and inputData and outputData of interaction patterns of things in the W3C WoT Thing Description ontology. We then propose practical solutions for web agents to be able to discover how a message content can be interpreted as RDF, generated from RDF, or validated, with different Web interaction protocols.
Keywords: interoperability; web architecture principles; linked data principles; web of things; RDF lifting; RDF lowering; RDF validation
Review 1 (by Armando Stellato)
(RELEVANCE TO ESWC) The author proposes to extend linked data principles (by means of introduced concepts such as presentation, RDF lifting and lowering, etc.) in order to take into account the communication needs of web agents that cannot communicate directly through common RDF formats (or even by text at all, resorting to binary formats for reasons of conciseness). This addresses the core of the web architecture and, given the focus on RDF, is thus central to the themes of ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) The proposed approach is, to the best of my limited knowledge on the specific topic, novel and relevant.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) I understand the expressed needs and agree on the principle of the proposed solutions. Nonetheless, I found it mostly difficult to go through the start of the paper (even though it is mostly informal with respect to the later content) as I had a general feeling of inadequate terminology and definitions (see my detailed comments in the review).
(EVALUATION OF THE STATE-OF-THE-ART) The paper has novel content and is thus quite original in its proposal. The reference to the state of the art is more than adequate.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The introduced notions and extensions to the LD principles have been properly discussed and motivated.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) Well, this mostly does not apply, as there is no experiment.
(OVERALL SCORE) In this paper, the author proposes to extend linked data principles (by means of introduced concepts, or re-formalizations of existing ones, such as presentation, RDF lifting and lowering, validation, etc.) in order to take into account the communication needs of web agents that cannot communicate directly through common RDF formats (or even by text at all, resorting to binary formats for reasons of conciseness).
The author also proposes practical solutions for supporting different communication scenarios using existing formalisms and standards and introduces an RDF Presentation ontology (providing concrete examples of its use in combination with the OGC and W3C joint SOSA/SSN standard ontology) to support his view. I have had two contrasting impressions from this paper: I found it mostly difficult to go through the start of it (even though it is mostly informal with respect to the later content) as I had a general feeling of inadequate terminology and definitions (see my detailed comments in the review), while I agree on the general needs that this proposal aims to cover and found the second part of the paper to the point and definitely interesting. I would like to see this work published and presented at the conference even though, having to judge based on a single shot before the (eventual) camera-ready, I am quite hesitant to say that it is ready for publication, and the rebuttal may only give answers and not a second version of it, hence my selection of “weak accept”. I hope that the rebuttal (possibly proving me wrong on some of my reservations, which is very possible given that my research does not address the core of the web architecture but topics lying on top of it) is enough to make this a convincing acceptance.
**** Some minor notes: ****
Acronyms: “OGC and W3C SOSA/SSN ontology”. W3C can be assumed to be clear in a paper at ESWC, but the Open Geospatial Consortium might not be as familiar to the reader, and neither might SOSA and SSN. Please expand OGC the first time it is mentioned and properly introduce SOSA and SSN since they are domain-specific terms.
In the introduction, the sentence “Yet, in the current context RDF data formats (RDF/XML, Turtle, JSON-LD) will probably never replace the existing ones” should be made more specific. What is the current context?
If talking in general about current trends (so “context” is not appropriate), “current” and “never” put together do not convey any concrete meaning. If talking specifically about the context described above, then it should be made clearer (e.g. “context above” instead of “current context”, or an even more explicit reference to WoT).
While agreeing on the relevance of defining a setting so that communication peers can reach what the author calls “perfect understanding”, I question the choice of the name. The conditions for communication stated in the introduction are not enough to ensure “understanding” (let alone “perfect” understanding), unless the author refers to agreement about how to communicate (and, as later explained in section 3, how to properly convey a message so that the RDF graph on both ends is the same), which is on another level. I understand the author’s intention, but the terminological choices (even when coining a new term such as “perfect understanding”) should be as clear and evocative of their meaning as possible.
In Section 2, p. 3, I would recommend quoting statements from the original sources, and then coming to the derivations that are instead provided by the author. E.g. “A representation of a resource is some data that encodes the content of that resource [(i.e., the information describing an information resource state)].” [16, §3.2]. Ian Jacobs and Norman Walsh. Architecture of the World Wide Web, Volume One. W3C Recommendation, W3C, December 15 2004.
I guess the original sentence from [16, §3.2] is: “representation is data that encodes information about resource state”. It would also be good to make it clear if this equality content = resource state is provided in , somewhere else, or introduced by the author.
**** Weak Points ****
* For a paper providing formal specifications, I found the first sections and the introduction of new terms informal and imprecise (more details in the full review under “overall evaluation”)
**** Strong Points ****
* a very interesting proposal for addressing heterogeneity of protocols and physical and logical constraints on the content sent by web agents
* the paper is intense but not too dense, providing properly described motivations, descriptions of scenarios, the general solution and practical examples, plus contributing a resource, the Presentation ontology, and discussing its adoption in existing cases
**** QAs ****
1. I’m not sure the expression “more specific” relating IRI lookup with dereferencing is appropriate. I assume that it is probably true and that I simply do not know the difference, but at least it should be clearer from the text. The author says that the lookup *can* involve multiple redirections though, as far as I know, even a dereferencing is not just resolving a URL and could include multiple redirections as well (and the definition reported by the author, the “aim to access an identified resource”, should include that). Also, if it is the lookup that is more specific, then we should be informed about the contrary: e.g. a dereferencing that is not a lookup. I’m not saying the author should provide this example in the paper (i.e. the extension, which at least I can use to understand this difference), but the paper is IMHO missing the intensional part and the reader is not able to grasp the difference.
From the Linked Data Book: “Any HTTP URI should be dereferenceable, meaning that HTTP clients can look up the URI using the HTTP protocol and retrieve a description of the resource that is identified by the URI” and “URIs of terms should be dereferenceable so that Linked Data applications can look up their definition”. While the implication given by the “so that” can potentially imply a subsumption between the two terms, no formal difference is given. In http://www.w3.org/DesignIssues/LinkedData.html, among the possible ways of linking data, the first section is called “basic web look-up” and ends with “we call this dereferencing the URI”.
2. The sentence from [16, §3.2], “representations do not necessarily describe the resource, or portray a likeness of the resource, or represent the resource in other senses of the word "represent"”, reverses the relation of inclusion between representation and description as provided by the author: “Let us stress the fact that a representation is a description, but the opposite is not always true.” While I do not object to the statements in (a) and (b), it would help to clarify the matter.
3. I might easily be wrong, but from [8, §1.5] I get that an RDF source does not have to be a web document: a SPARQL endpoint could be an RDF source while not being a web document. Web documents only appear in examples. Also, the expression “essential characteristics” has not been used in formal definitions, only as support to explain some terms.
Review 2 (by Jean-Paul Calbimonte)
(RELEVANCE TO ESWC) Interoperability in IoT is relevant for ESWC. However, the core of this paper is an ontology, which would fit better in the resources track.
(NOVELTY OF THE PROPOSED SOLUTION) Several of the proposed ideas were already introduced in a previous paper: Interopérabilité sémantique libérale pour les services et les objets. In EGC 2017. So the overall novelty is quite limited.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Correctness and completeness are difficult to assess, given that many of the proposed ideas seem to lack evidence of effectiveness and applicability, and stay for the moment at a conceptual level.
(EVALUATION OF THE STATE-OF-THE-ART) Sufficient for the most part, although there is no dedicated state-of-the-art section.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) General discussion of the presented ideas, although there is little evidence of effectiveness or applicability.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) No implementation or evaluation, other than a simple verification of CQs.
(OVERALL SCORE) This paper presents an ontology for representing RDF content in different formats, and a discussion on how Linked Data principles can be extended in order to do so. The ontology and the overall idea are centered on Web of Things scenarios. The paper introduces a discussion and proposes potential solutions regarding interoperability in WoT scenarios. The manuscript in its current form is not very clear and not very easy to follow, as it mixes different concerns, which makes it hard to understand what its primary goal is. Although the title and most of the introduction point to the current challenges of using the Semantic Web (and specifically RDF) for data representation in IoT, later on the paper singles out the RDFP ontology as the main contribution.
To me, it would be advisable to rethink the structure of the paper and make it clear whether it is an ontology paper (for which the resource track would be a better fit), including all the criteria for publishing such a type of resource. Or, on the other hand, if the focus is on the architectural proposal, then the approach would require a full validation and evaluation in order to test the research hypothesis. Having said that, the current content of the manuscript fits more with the structure of an ontology (resource) paper. It would be interesting to see some of these ideas in action, so for future work it might be important to provide a prototype or implementation, and an accompanying evaluation to validate the initial claims. Otherwise, the paper's ideas stay at the conceptual level, and again the ontology would remain the main product of this paper. Another important discussion concerns the use of binary RDF representations and in particular compression for RDF, like HDT or its streaming version ERI, which is supposed to be applicable to IoT scenarios. Other initiatives like zstreamy could also be of interest in this regard, and would probably be worth discussing.
Strengths:
- identifies limitations of LD for IoT scenarios
- provides an openly accessible and documented ontology
- aligns with the relevant works of WoT and OGC/SSN.
Weaknesses:
- confusing structure/organization of presented ideas
- limited novelty wrt. previous paper on the topic
- no implementation/evaluation to validate the presented ideas
Questions:
1. how does RDF compression fit with the RDF presentation model described here?
2. what is the cost of the 'lifting' processes in IoT data exchange?
3. in case of 'trusted lifting', as mentioned in the paper, what mechanisms guarantee this trust? Is there a considerable overhead for this additional mechanism?
4. what is the specific novelty/new contributions wrt.
previous work: Interopérabilité sémantique libérale pour les services et les objets
Minor: Step 3: you mention the "scope of the ontology". But nowhere before in the introduction is it said that an ontology will be proposed and why. Most of section 2.1, definition of WW, resource, etc., seems unnecessary for an ESWC audience. Just a reference would suffice in my opinion. Competency questions asking how …etc. are not sufficiently unambiguous. CQs are usually much more targeted.
After rebuttal: thanks to the authors for the comments. I keep my initial assessment, which doesn't change substantially after the response.
Review 3 (by anonymous reviewer)
(RELEVANCE TO ESWC) The paper targets RDF presentations for stream data. While it builds on top of LD principles, the paper would also be suitable for the Mobile Web, Sensors and Semantic Streams track.
(NOVELTY OF THE PROPOSED SOLUTION) The main points of the paper are a formalization of RDF presentations, an extension of the LD principles for the WoT domain, and the RDFP ontology. The ontology itself was already available prior to the paper. The formalization is high level and lacks details (e.g. what do the authors mean by a homogeneous relationship, and how does one define the essential characteristics of an RDF source?).
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The paper defines a formalism to translate sensor data to and from RDF, and the correctness of such translations is left to the underlying representations used. Not sure this item applies to this paper (weak accept chosen as neutral score).
(EVALUATION OF THE STATE-OF-THE-ART) The proposed formalism is based on existing technologies for its implementation, and the paper mentions a few related works, but no comparison to the existing state of the art is given.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) Different scenarios in which the RDF presentations can be used are described but not demonstrated.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) An experimental evaluation is not given. The feasibility of the implementation is not addressed, e.g., how much time is added to retrieve representations and apply lowering/lifting rules.
(OVERALL SCORE) The paper proposes RDF Presentations, which are intended to be a link between low-level stream data and their RDF representations. The goal is to allow semantic interoperability without overloading low-level devices with verbose RDF representations.
Strong points
SP1: The approach is timely and has a real-life application.
SP2: The approach is flexible as it does not impose the use of a particular technology.
Devices are free to choose their representation formats.
**** Comments to authors' response ****
I acknowledge the response from the authors, which clarifies a few items mentioned by the reviewers. Nevertheless, I still think the paper needs more work to be ready for publication, therefore my score remains.
WP1: High-level formalization. No examples.
WP2: Feasibility of the implementation not discussed, e.g. the time needed to retrieve representations and apply lowering/lifting rules.
QA: As the RDFP ontology has already been available for a couple of years and is mentioned in the SSN W3C Recommendation, I understood that it is not part of the novel contribution of the paper. If that is not the case, please clarify.
Review 4 (by anonymous reviewer)
(RELEVANCE TO ESWC) The topic is of relevance to ESWC.
(NOVELTY OF THE PROPOSED SOLUTION) The novelty is not clear from my reading of the paper. There may be something there (matching of representations to resource-constrained devices), but the paper has not clearly demonstrated it.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) No evaluation of the work was provided for validation.
(EVALUATION OF THE STATE-OF-THE-ART) There was no explicit section on related work.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) Only a limited discussion of the approach is provided.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The ontology is available. However, no evaluation is performed.
(OVERALL SCORE) The author tackles the problem of semantic interoperability at the data level (due to data heterogeneity). The work tries to bridge the gap in understanding of the data that agents exchange between themselves. The author proposes the RDF Presentation Ontology (RDFP), which can model inputs and outputs of procedures, by extending existing definitions from W3C documents. The author conceptualizes RDF Presentations and their lifting, lowering and validation procedures. He proposes an extension of the linked data principles. The author presents the RDFP ontology and aligns it with SOSA/SSN and WoT TD. The ontology is formulated on the basis of his proposed competency questions, which consist of various interaction scenarios between publisher and subscriber. Finally, the paper tries to realise scenarios for discovering information about RDF presentations directly or indirectly by adding fields in the headers of protocols like HTTP. I found the paper difficult to read and would suggest some improvements in readability:
- The competency questions only come at page 9. It would be helpful if they came sooner in the paper.
- Towards the end of section 2 you start to discuss a possible contribution. I would take this out of the “fundamentals” section.
- I would be interested in a deeper justification for the need for further principles of linked data.
- The rationale for the “lowering” and “lifting” could be expanded. An example to motivate it would be very useful.
- The language in section 3 is loose. It would be easier to understand if you were more specific.
- How would you evaluate the work? Were all the competency questions satisfied?
- An explicit section on the SOTA would be nice, to see the contribution of this work.
Strong Points
- The concept of RDF presentation.
- The author describes the scenarios with competency questions and then proposes his Presentation Ontology.
Weak Points
- The core problem of the paper is difficult to understand. I am not 100% sure of the exact problems tackled.
- The WoT and constrained-device angle adds further complexity to the paper.
- I had to read sections multiple times to understand the message. Some diagrams could make it easier to read.
Question to Author: What is the relationship to approaches such as Klímek, J., & Necaský, M. (2011, March). Generating lowering and lifting schema mappings for semantic web services. In Advanced Information Networking and Applications (WAINA), 2011 IEEE Workshops of International Conference on (pp. 29-34). IEEE.
----- Added after rebuttal: I acknowledge the authors' comments on the reviews provided in their response letter and I confirm my scores.
Metareview by Jorge Gracia
According to the reviewers, the approach described in this paper, which tackles the problem of interoperability in WoT scenarios, is timely and potentially useful. However, the author fails in the way the work is communicated: the paper is not well structured, the discussion is unclear and difficult to follow, and therefore difficult to judge. Furthermore, there is no real evaluation/validation of the ideas in the paper. The reviewers have not substantially changed their judgement based on the authors' reply letter.