Paper 112 (In-Use track)

Exploring Enterprise Knowledge Graphs: A Use Case in Software Engineering

Author(s): Marta Sabou, Fajar J. Ekaputra, Tudor Ionescu, Jürgen Musil, Daniel Schall, Kevin Haller, Armin Friedl, Stefan Biffl

Full text: submitted version | camera ready version

Decision: accept

Abstract: When reusing software architectural knowledge, such as design patterns or design decisions, software architects need support for exploring architectural knowledge collections, e.g., for finding related items. While semantic-based architectural knowledge management tools are limited to supporting lookup-based tasks through faceted search and fall short of enabling exploration, semantic-based exploratory search systems primarily focus on web-scale knowledge graphs without having been adapted to enterprise-scale knowledge graphs (EKG). We investigate how and to what extent exploratory search can be supported on EKGs of architectural knowledge. We propose an approach for building exploratory search systems on EKGs and demonstrate its use within Siemens, which resulted in the STAR system used in practice by cca. 300 software architects. We found that the EKG’s ontology allows making previously implicit organisational knowledge explicit, and this knowledge informs the design of suitable relatedness metrics to support exploration. Yet, the performance of these metrics heavily depends on the characteristics of the EKG’s data. Therefore, both statistical and user-based evaluations can be used to select the right metric before system implementation.

Keywords: software engineering; software architectural knowledge; enterprise knowledge graph; exploratory search


Review 1 (by Carlos R. Rivero)


This paper presents a semantic-web solution to enrich software architecture products so more complex searches can be performed.
This is the type of paper I was expecting in this track: the authors present how semantic-web technologies can be leveraged to search for documentation regarding software architecture. A system implementing this concept has been used in a real-world scenario at a company.
Please, avoid using so many acronyms: AK, AKM, KG, EKG, AQ, ES, SCT, AE... so confusing! It was very hard for me to find what “cca.” means.
---- After rebuttal ----
I acknowledge comments from the authors. I am keeping my score.


Review 2 (by Paul Groth)


Comments after rebuttal: I thank the authors for their response. I look forward to seeing the ontology made available. 
This paper describes a system for supporting exploratory search for software architecture elements using an enterprise knowledge graph. The system is deployed within Siemens and was evaluated with a small user study (8 people) as well as a statistical comparison of the considered relatedness metrics. 
Overall, I enjoyed this paper. Software architectural knowledge is complex and has many different levels of semantics that need to be considered ranging from applicability to a given software design to the more non-functional requirements that particular patterns or design documents cover. The engagement with Siemens in their existing architectural knowledge management system makes for a compelling in-use example. The paper is well situated within the literature. 
There are places where I think the paper could improve:
1) The connection between enterprise knowledge graphs and architectural knowledge could be stronger. Is the assumption that architectural knowledge is part of an enterprise knowledge graph? I think the paper could be stronger by making clear that the lessons learned really apply specifically to architectural knowledge - enterprise knowledge graphs.
2) The connection between relatedness metrics and the task is a bit unclear, especially in the first part of the paper. It takes until page 5 to understand that relatedness metrics drive exploratory search and that exploratory search is the key task for software architects when using these search systems. 
3) Probably the biggest gap, in my opinion, is the selection of the evaluation criteria. The research questions are focused on the specific features of the system described (e.g. explanations, relatedness metrics), but why not an RQ about improvement in performance on architectural knowledge search tasks? How much better do architects perform on a given task? Or, at least, how do they perceive the system? Do they like it better than their current system?
4) It would have been great to provide detailed usage numbers. For example, how often do the specified 300 users hit the semantic search facility? Are they using it frequently?
5) Would it be possible to make the ontology developed for the STAR system available? That looks like an extremely useful artifact. 
In summary, a nice in-use paper addressing an interesting area of practice.
Minor comments:
* Can you give examples of meaningful features on p.4?
* In formula IC(d), you should define how to read "ae.addresses.d".
* In formula IC(d), you should say what the -log is doing.
* last page "(semi- )automatic" is oddly formatted.


Review 3 (by Ali Khalili)


This paper investigates and evaluates exploratory search in architectural knowledge repositories, based on the STAR approach, ontology, and system, in support of, e.g., weakly defined information tasks and serendipitous knowledge discovery.
This research addresses a relevant open challenge in architectural knowledge management.
I think it is a good idea to compute metrics for quantifying the strength of relatedness between architectural knowledge elements.
This paper reports clearly on the research, which itself is relevant and original. 
My main concern is that most semantic relationships (object properties) in the STAR ontology in figure 2 are <hasEffectOn>.
A <hasEffectOn> relationship between AK elements seems appropriate for computing a uniform relatedness metric for an exploratory search system.
However, it does not seem suitable for targeted AK retrieval, in which more well defined questions are answered. 
Relationship <hasEffectOn> lacks semantics compared to (Req)<addressed by>(decision), (decision)<satisfies>(Req), (AE)<badFor>(QA), (AE)<goodFor>(QA), <implements>, etc...
Moreover, the end of section 6 and the start of section 7 seem to indicate that the lack of semantic (meaningful) relationships also negatively affects the evaluation of exploration tasks ("Useful suggestions were collected for improving explanations, most of them (e.g., (2), (4) in the following list) revealing the need for more sophisticated semantic analysis on more detailed semantic annotations." and "However, the solution lacked more detailed information about the relation between AKs and therefore provided limited support to explore the AK collection in more depth."). 
I strongly recommend discussing this more explicitly in the conclusions & lessons learnt section, as an explanation of the evaluation results, a limitation of this work, and possible future work.
In future work you could, e.g., investigate the effectiveness of more specific/fine-grained relatedness metrics that are based on the semantics (meaning) of relationships between AK. For example, in your ontology the relationship (decision)<uses>(AE) seems stronger than (AE)<hasEffectOn>(decision), and <uses> could therefore be assigned a higher weight in the relatedness metric computation. Another (seemingly more difficult) future work/challenge: at the start of Section 5 you wrote "AEs can have either negative or positive effects on AQs" — this could be further quantified for the relatedness metric; some pattern might, for example, have a limited positive <hasEffectOn> AQ security and at the same time a far more positive <hasEffectOn> AQ reliability.
Other comments:
- Start of section 4 - This seems to be a use case, or the report of a single application, of the STAR approach; this could be made more explicit at the start of section 4 or in its title.
- Section 4 - on the STAR ontology - can you link to (a URL on the internet) the source OWL/RDF(S) file of the STAR ontology in the paper (without instances, if these are confidential)?
- At the end of section 1 it reads "we report wok in the context of" - should be "work".
- In the abstract it reads "used in practice by cca. 300 software" - should "cca." be "ca."?
---- After rebuttal ----
I acknowledge the authors' comments/rebuttal. I am happy with their response to my recommendation/critique, and with their proposed changes/additions in the paper.


Review 4 (by Daniel Garijo)


------------AFTER REBUTTAL-----------
I thank the authors for their responses. Since it looks like the associated resources of the paper will be made available on the final version, and some of my concerns have been addressed, I will increase my score.
------------ORIGINAL REVIEW----------
This paper introduces a methodology for creating exploratory search systems on concrete enterprise knowledge graphs. The authors showcase their approach with a real use case from Siemens, affecting between 200 and 300 company workers. 
The topic of the paper is not novel, although it deals with a real world problem and it seems to have impacted a significant amount of workers at Siemens. Hence, I think it has important lessons learned that could be useful to others in the ESWC crowd.
The paper is well written, but the constant usage of acronyms makes it hard to follow, and I have found several typos ("these system support", "we report wok").
The resources that are described in the paper are not available. The requirements used to derive the ontology in Figure 2 are not available, the ontology is not available and the platform shown in Figure 5 does not seem to be available. 
Figure 2 does not have a legend, and it is unclear what the bidirectional arrows mean.
It is great to see that the work has been used by users at Siemens. However, 200 is not the same as 300 users. Can the authors provide a better estimate? (or reduce the range)
It is unclear to me why related work cannot be used in the context of this paper. Also, there are existing agile methodologies for vocabulary development that could be used for creating an exploratory system. What is the difference between those and the one proposed?
Some of the relatedness metrics do not define q(a1,a2). It looks very subjective!
The evaluation consists of only 8 users, although the work is being used by more than 200. Isn't it possible to survey some of the 200 users who actually used the system?
Figure 5 does not explain what the tags on the bottom right are.


Review 5 (by Anna Tordai)


This is a metareview for the paper that summarizes the opinions of the individual reviewers.
This paper fits the In-Use track very well. It describes a system for semantic exploratory search that is deployed at a company, and includes a (small) evaluation. The reviewers point out some minor issues with the presentation. The authors have responded sufficiently to the reviewers' concerns about public availability of resources. The paper would improve from a more in-depth discussion of evaluation results, including limitations of the approach.
Laura Hollink @ Anna Tordai

