Linking open public datasets for multifaceted music discovery
Author(s): Alo Allik, Delia Fano Yela, Mark Sandler
Full text: submitted version
Abstract: Music artist recommendation has conventionally embraced the idea of providing users simple lists of suggestions based on similarity. Recommendation systems typically strive for efficiency and accuracy relying most frequently on different variants of bipartite graph filtering, content-based information or a hybrid of both. A more engaging music discovery system built on the principle of multifaceted representation, on the other hand, would arguably benefit from recommendation diversity to provide a novel and enriched experience of exploration. Here we propose an alternative music artist recommendation technique that not only reveals connections between artists, but also the nature of these artist connections to enhance music discovery. For this purpose we have developed a graph-based approach for music artist representation. This involves linking together a number of open public music-related datasets using Semantic Web technologies and Linked Data principles. Different types of data, including music publishing metadata, biographical and socio-cultural information, content-based feature extraction, and crowd-sourced tags, can thereby be combined into an integrated artist similarity graph.
Keywords: linked data; graph theory; music similarity; music information retrieval; machine learning
Review 1 (by Edna Ruckhaus)
(RELEVANCE TO ESWC) Music recommendation systems that make use of Semantic Web techniques, which are the focus of this work, are an area relevant to ESWC. (NOVELTY OF THE PROPOSED SOLUTION) Each of the three types of techniques presented in this work is not novel, and the MusicLynx system does not combine or integrate them (which would be a novel approach). The system rather relies on the diversity of the recommendations as the focus of the approach, but the advantages of "diversity" are not clearly supported. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The presentation of the three "subsystems" in sections four, five and six is complete and correct but in some cases hard to follow. (EVALUATION OF THE STATE-OF-THE-ART) The state of the art described in Section 3 is not comprehensive enough. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) In the light of a comparison to other music recommendation systems, the advantages of the proposed approach (diversity) could be supported. Additionally, the presentation and discussion of the properties of the proposed approach is sometimes not clear. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The work would be more conclusive if the experimental study considered other music recommendation systems. Also, there is no clear indication of the experiments being reproducible, and the discussion in section 9 is not conclusive. (OVERALL SCORE) This paper presents the MusicLynx Recommendation System (for artists), which uses Semantic Web techniques to link several public datasets in the music domain and uses similarity measures related to the DBpedia categories that the artist belongs to, music content such as tonality and rhythm, among others, and crowd-sourced mood-related information. An experimental study has been carried out on the diversity index of the similarity measures that have been implemented. (SP1) The work is in an area relevant to ESWC.
(SP2) The MusicLynx system has a front-end that allows an end-user to easily use the system. (SP3) The presentation of the three parts of MusicLynx, sections 3, 4 and 5, is complete and correct. (WP1) The presentation of the approach is in some cases hard to follow, for example the explanation of the proposed solution for content-based similarity in section 4. An example in all cases would help to understand the proposed solution. (WP2) The recommendation system combines graph similarity techniques using DBpedia categories, content-based similarity, e.g. rhythm, tonality, and similarity based on crowd-sourced tags. Each of the three types of techniques is not novel, and the MusicLynx system does not combine or integrate them, which would be a novel approach. (WP3) The state of the art described in Section 3 is quite limited. It refers to a couple of systems that use Semantic Web techniques; it also refers to content-based similarity as only using small datasets in their experiments. The reference to Lanckriet's work would be the one most related to this work, but the contribution of MusicLynx in comparison does not seem sufficiently strong, only referring to the use of a larger artist pool. (WP4) The presentation of the proposed approach is not clear. There should be a general introduction of MusicLynx before describing open public datasets; this makes Figure 1 out of context, as at that point there is no notion that there is a proposed system, MusicLynx. Thus, there is no clear definition of the general contribution of the proposed approach. Diversity is defined (diversity index), but the motivation for stressing it over accuracy and efficiency should be stated. (WP5) Having software illustrating the approach is a plus. However, in a way it makes evident the disjunctive nature of the two systems (I was not able to see any mood-related similarity example).
Also, some categories seemed unrelated to the music domain and generate nodes that do not seem useful; maybe not all of the categories in DBpedia should be selected, e.g. I looked for a Catalan singer and it had nodes on "other Spanish people". (WP6) The experimental study examines diversity among the different similarity methods used. There is no indication of it being reproducible and the discussion in section 9 is not conclusive, e.g. "there are as many ways to interpret the results as there are method pairs..." "similarity connections are meaningful just fundamentally different".
Review 2 (by Anett Hoppe)
(RELEVANCE TO ESWC) The work shows how linking a novel combination of knowledge repositories can be used to provide users with more diverse music recommendations. While combining different information sources alone might rather qualify for a demonstration, the proposed similarity measures can be considered a scientific contribution. (NOVELTY OF THE PROPOSED SOLUTION) Combining different knowledge resources for a certain task (even music recommendation) is not a novel task or solution. However, the authors propose a set of similarity measures that go beyond the state of the art. Furthermore, the focus on giving the user more diverse suggestions instead of just similar artists is one that has not been followed in music recommendation -- to the best of my knowledge. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The approach and working steps are comprehensible. Tiny remark: Is “mo:musicbrainz_id” the correct name for the relationship property connecting MusicLynx to BBC music? I suspect it should be “mo:musicbrainz_guid” according to the specification of the music ontology (http://musicontology.com/specification/). (EVALUATION OF THE STATE-OF-THE-ART) There is no section dedicated to the examination of the state of the art; closely related references are discussed directly in the topic sections. This might help to cover that the paper is rather short on references. There should be at least a recent survey paper cited; especially when looking at sections 3, 4 and 6, there are certainly some recent works which should be mentioned. Furthermore, the works by Sergio Oramas seem similar to your work, for instance: Oramas, Sergio, et al. "Sound and music recommendation with knowledge graphs." ACM Transactions on Intelligent Systems and Technology (TIST) 8.2 (2017): 21. Oramas, Sergio, et al. "A Semantic-Based Approach for Artist Similarity." ISMIR. 2015.
(which uses some of the knowledge sources that are used here as well, for instance the DBpedia artist categories). A more detailed discussion of the state of the art is necessary and should be added for the final version of the paper. In particular it has to be shown how the presented work distinguishes itself from the above-mentioned examples (and possibly other similar propositions). (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) Working steps and rationale are clear; the paper links a functioning prototype of the solution. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The evaluation calculates, by way of example, diversity measures between the used categories and the result is as expected. However, this does not evaluate the usefulness of the approach as, for instance, perceived by possible users of the system. Some sentences of interpretation should be added to the discussion of Table 1. For what reason was N set to 17 in the experiment? The authors argue that it might be impossible to give a reliable qualification of “a concept as complex and subjective as artist similarity” (with which I agree). However, given that the method has the objective to support users in discovering new artists, this is the task it should be evaluated on: How much new material can be discovered with the tool (as opposed to existing platforms such as last.fm)? How much fun do users have using your visualization? How well can they handle the interactions? Is the diversification of results actually perceived as helpful? Do all dimensions (tags, content-based etc.) have the same impact on user satisfaction? To me, for instance, the way the “artist bubbles” are placed with respect to each other on the canvas is not 100% clear. And I could not find an explanation just by trying your prototype on different artists. (OVERALL SCORE) The presented paper proposes a novel combination of music-related knowledge repositories for music recommendation.
The focus is not solely on artist similarity as in other works; rather, the objective is to diversify the suggestions. A tool is presented which allows visual exploration of the artist space. Strengths: The paper is easy to read and targets an interesting topic; the combination of content-based features and artist attributes seems an interesting one and a suitable application area of graph-based knowledge repos. The working steps are clearly depicted and comprehensible. A running prototype is available which allows actually experiencing the interactive interface. Weaknesses: The biggest weak point would be the treatment of the state of the art – music recommendation (content- and artist-based) is not a novel topic and there are quite a few works which should be cited (one example which actually uses similar knowledge sources is named above, but I would not limit the scope to only graph-based methods for a thorough review of the related work). Furthermore, the evaluation should be extended – it seems reasonable to assume that more diverse suggestions could be more interesting, but so far, reading only the paper, it remains an assumption. The usefulness of the proposed prototype system is not evaluated.
Review 3 (by Tomi Kauppinen)
(RELEVANCE TO ESWC) This work presents a graph-based approach to music recommendation, implemented using semantic web tech and linked data, but does not contribute to semantic web theory nor a new method. (NOVELTY OF THE PROPOSED SOLUTION) The authors argue for novelty quite nicely - but this paper is merely a contribution in terms of the use of semantic web rather than a contribution as a novel method. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Very nice approach, and well evaluated. (EVALUATION OF THE STATE-OF-THE-ART) Section 3 covers the state of the art on similarity modelling to some extent. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) This paper demonstrates and discusses the approach, and compares it with other solutions for music similarity analysis. However, in order to contribute as a novel recommendation approach, a more in-depth evaluation against competing recommendation approaches is needed. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The code is on GitHub, and the implementation should be quite doable based on the information in the paper. (OVERALL SCORE) Strong Points (SPs) - the approach is evaluated (although not widely against competing approaches) - provides a compelling use case (music) for recommendation - the paper is a good read Weak Points (WPs) - this paper is merely a contribution in terms of the use of semantic web rather than a contribution as a novel method. - I would have only hoped for an intuitive explanation of the results of artist similarity as depicted in Figure 3 - lacks a proper evaluation against competing recommendation approaches Questions to the Authors (QAs) - Is Figure 3 referred to in the text? Minor issues: "The The coordinate data" -> "The coordinate data"
Review 4 (by Mathieu D’Aquin)
(RELEVANCE TO ESWC) The paper presents a recommendation approach for music artists which relies on graphs from multiple linked data sources. While the use of linked data sources and of graph processing for recommendation is of interest to the audience of the conference, the paper is not a research paper with a contribution to semantic web challenges. It would certainly fit better in the In-Use track of the conference. (NOVELTY OF THE PROPOSED SOLUTION) The solution presented is similar in principle to many others, as it relies on similarity in the graph connecting artists and different aspects. There are some claims to novelty, however, with respect to the kind of aspects which are taken into account and the way they are processed. The focus on diversity is also different from those other works, but I'm not sure how it might have been considered in other recommendation work outside of the semantic web community. (CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The approach appears correct generally, and is certainly interesting. The way the different sources of data are used is described, but some aspects remain a bit unclear (e.g. how sameAs.org is used, considering that it is generally considered as very inaccurate). (EVALUATION OF THE STATE-OF-THE-ART) Other approaches are described only briefly. The discussion on the merits and drawbacks of the proposed approach compared to those others is not particularly advanced. (DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The evaluation of the approach considers the measure of diversity of the recommendations, based on the use of different properties of the graph. The authors argue that evaluating this aspect empirically is difficult; I would still have wanted to see how the results compared with other approaches to recommending artists.
That the recommendations differ a lot when different properties are considered, as stated, if I understand well, in the conclusion, does not seem to me to be necessarily a positive outcome. The quality of the recommendations is not evaluated. (REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The method seems sufficiently described that it could be applied on the same (open) sources or others, although there are parts which are not very precise. I would have liked to see a discussion on whether the approach is generalisable to recommendation of other kinds of things than music. (OVERALL SCORE) The paper presents a recommendation method which favours diversity by using different aspects of a graph connecting artists from multiple linked data sources. This is potentially very interesting in showing the value of linked data, and the approach seems generally valid. The contributions to semantic web research are however not clear, and the results are not evaluated in a way that can help understand what the actual benefits of this approach are compared to alternatives (using or not semantic web technologies).
Metareview by Hala Skaf
While the paper describes an interesting application, the work itself is not quite novel, according to the reviewers. It reuses existing techniques but not in a way in which their combination or integration shows novelty. Also, the contribution is not sufficiently evaluated in comparison to similar approaches. The authors do not provide any rebuttal.