Use of Social Media and ontologies on identifying potential witness for criminal cases
Author(s): Brigite Peposhi, Edlira Kalemi, Sule Yildirim
Full text: submitted version
Abstract: The big expansion of social media and the wide usage of them from intelligence agencies to investigate and solve crimes lead us to find a new perspective on the way of aggregating and processing information. In this paper we contribute on the efficiency of tools to investigate the crimes and we achieve it by using match-making and data-mining techniques. We have built a tool which will help the intelligence agencies to identify potential witnesses from social media by finding social media users that have checked-in near the same location where the crime has happened. The tool we designed and developed makes two searches: 1) searching the knowledge base (the crime ontology), where crimes are registered, so that we can search for the crimes that all occurred at a specific location, and finding the relations among these crimes; 2) searching on social media, where by using the data-mining techniques, we may find potential witness for the crime case. The tool built in this paper serves as a base for further work on the use of semantic web and ontologies on preventing and solving criminal cases.
Keywords: social media; crime solving; ontology; match-making
Review 1 (by anonymous reviewer)
(RELEVANCE TO ESWC) There is very little relevance to ESWC or the Semantic Web. It's not clear how the ontology is used or what purpose it serves.
(NOVELTY OF THE PROPOSED SOLUTION) Since the details of the method, and especially the use of the ontology, are so unclear, the novelty is also completely unclear.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The details are so sketchy that one cannot even begin to comment on the correctness. The method is not evaluated or compared, so one cannot tell if it even works.
(EVALUATION OF THE STATE-OF-THE-ART) There is almost no understanding shown of the background or state of the art.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) It is totally unclear how the approach compares with other approaches.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) It is totally unclear what was done.
(OVERALL SCORE) Summary: The paper describes a method to identify potential witnesses for crimes based on their Facebook location. However, the details of the method are so sketchy that it is hard to understand *how* the method works. Also, since it only uses mockup data, there are no real conclusions or results that can be drawn.
Strong points: The topic of this work is potentially interesting. If the method were applied to an open source of data such as Twitter (though here there are still problems since very few tweets are actually geolocated) and it could be proven that it actually works and helps the police, then it could be very interesting.
Weak points: Unfortunately there is very little content in the paper itself. Although 15 pages are allowed, the authors use only about 6 1/2 for actual content. The main problems with the work are the following:
1. There is no detail at all about the method. It's completely unclear what was actually done, what the ontology was used for, and so on.
2. The work is not situated in the field - there is no description of related work or demonstration of knowledge about the area.
3. It is not clear what benefit the ontology and semantic web side of the work actually provides. There is no description of the ontology and no discussion of the suitability of the ontology.
4. There is no evaluation. There is no evidence of how the semantic aspect helps the police go beyond what they already have. How does using the ontology benefit over just searching their existing criminal databases?
5. The Facebook data is totally made up, so there is no evidence that the tool would work on real data. Besides, since most Facebook posts are not public, the necessary information would never be available to the system.
6. It's entirely unclear if there even is sufficient data on Facebook to help with the problem (though one could assume that there might be - however, it needs to be proven in order for the system to be useful).
A few more specific points:
- In the Introduction, it's not clear what is meant by "process the information from a new perspective". What perspective?
- The point about filtering information from Facebook requiring specifically techniques such as Euclidean Distance is rather narrow and shows a lack of broader understanding (or more precise specification). There are many ways in which one could filter relevant information from Facebook.
- There are no details given about "manipulation of the ontology". What does this even mean?
- What does it mean to "register a case with the ontology"? How is this done?
- In the Methodology section, the point about Agile development is rather irrelevant. There is no need to mention this or to include the picture of Agile development; it adds nothing useful.
- In the Methodology section, there are no details of what the tool does.
In summary, this work is very preliminary, lacks description and content, and has no evaluation. Plus it seems totally infeasible as it stands.
However, if the method were applied to an open source of data such as Twitter (though here there are still problems since very few tweets are actually geolocated) and it could be proven that it actually works and helps the police, then it could be very interesting.
Review 2 (by Daniel Garijo)
(RELEVANCE TO ESWC) The paper is relevant to the ESWC conference, but not to the research track, as it introduces an application.
(NOVELTY OF THE PROPOSED SOLUTION) The application is a novel use of semantic web technologies. As far as I know, no other application aims to detect witnesses based on semantic metadata relatedness.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) The application is not evaluated. No real research contribution is described.
(EVALUATION OF THE STATE-OF-THE-ART) No state of the art is provided.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The application is not evaluated against users or other systems.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The application doesn't seem to be available.
(OVERALL SCORE) This paper describes an approach for detecting potential witnesses of crime scenes based on their position and time relative to the crime event. The paper reads well, and the underlying idea pursued by the authors is an original use of semantic web technologies. However, in its current stage the paper is not ready to be accepted as part of ESWC. I briefly list my main concerns below.
- The paper presents a tool, but it is on the research track. This work does not describe any contribution in terms of research.
- The paper has no evaluation of the tool.
- There is no related work section.
- The tool itself and the ontology described do not seem to be available for review.
- The authors don't motivate the use case with any real-world scenario that could potentially consume the described tool.
Review 3 (by Armando Stellato)
(RELEVANCE TO ESWC) A new ontology is surely relevant to the track, but the paper lacks focus: are the authors presenting the ontology, some of the techniques used, or an application of it? In 9 scarce pages (with a lot of white space) all of these aspects are only vaguely touched upon.
(NOVELTY OF THE PROPOSED SOLUTION) No novelty: crime-finding by reasoning was a (somewhat naïve) Semantic Web advertisement example in the early 2000s, and the authors do not add anything relevant to the topic.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) No detailed description of the ontology (considering the track where it has been submitted), no experimentation, useless technical details.
(EVALUATION OF THE STATE-OF-THE-ART) Lack of references on: 1) claims by the authors; 2) existing theories, approaches, paradigms, etc. (e.g. the Agile method in section 3); 3) technologies (OWLAPI, Hermit). P.S. Where is section 2?
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) No detailed description of the ontology (considering the track where it has been submitted), no experimentation, useless technical details.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) Nothing to reproduce. I would say this section does not apply, but the authors even failed to report basic information.
(OVERALL SCORE) The authors introduce a tool that, they claim, will help intelligence agencies to identify potential witnesses from social media by finding social media users that have checked in near the location of a crime scene when the crime happened. The paper does not seem to rest on a solid scientific basis. Expressions such as “To filter the large information provided from Facebook in an efficient way it is necessary to apply techniques such as Euclidean distance and Cosine Similarity which are widely used data-mining approach.” do not reflect any finding, nor do they cite any source proving that those techniques are necessary (not to mention the vagueness of the statement). Also, from the abstract: “We have built a tool which will help the intelligence agencies to identify potential witnesses from social media by finding social media users that have checked-in near the same location where the crime has happened.” combined with “searching the knowledge base (the crime ontology)” (which presumably is not a real repository but an idea of the authors) results in a big claim (“will help the intelligence”) with no solid foundation in any real project. There is a non-relevant specification of technologies and technical details for a paper about an ontology, and these are not properly justified.
1. Why OWL API? Does OWL API scale properly to the quantity of data being managed? It can be optimal for managing OWL ontologies (intended as schema only), but not for large quantities of data, unless it is used as a client for some triple store. The same consideration holds for Hermit.
2. The details in the “Experiments and Results” section are not meaningful. The authors describe the prompting of data as if they were writing an OWL editor manual. Is the tool they mention an application of theirs or an editor? In the first case, why provide so many details, such as the class tree of the whole ontology, in the application? That is counter-intuitive for a user. In the latter case, there is no need to show how to use a normal ontology editor, which is not in the scope of the paper.
3. The code for using Hermit on OWLAPI is really not relevant.
There is actually no experimentation, only some commented screenshots and little more information on what the system promises to do. As is, it is little more than an exercise in the use of Semantic Web technologies, even lacking an evaluation of the appropriateness of these technologies.
**** Weak Points ****
* no focus on a specific topic (ontology, application, experiment?)
* shallow description of everything, and lack of space is not an excuse
* no supported claims
Review 4 (by anonymous reviewer)
(RELEVANCE TO ESWC) The topic as such is interesting but the relation to Semantic Web technologies is not well worked out.
(NOVELTY OF THE PROPOSED SOLUTION) This is impossible to judge as it remains unclear what the role of SW technologies actually is.
(CORRECTNESS AND COMPLETENESS OF THE PROPOSED SOLUTION) Does not apply.
(EVALUATION OF THE STATE-OF-THE-ART) As far as I can tell, this tool falls substantially behind other crime analysis tools.
(DEMONSTRATION AND DISCUSSION OF THE PROPERTIES OF THE PROPOSED APPROACH) The paper lacks the details to clearly answer this question.
(REPRODUCIBILITY AND GENERALITY OF THE EXPERIMENTAL STUDY) The discussed experiments cannot be reproduced.
(OVERALL SCORE) The paper entitled 'Use of social media and ontologies on identifying potential witness for criminal cases' presents a tool to assist intelligence agencies in crime analysis by using social media data such as check-ins near crime spots. The content and ideas as such are interesting and could be relevant for ESWC 2018. However, all this is overshadowed by major language issues and structural deficiencies. I can only assume that the authors were running out of time to polish their work, but, as is, the paper is barely readable, largely unformatted, and unstructured. Finally, almost all references are malformed.
I find the combination of social media, semantic web technologies, and crime analytics interesting, but it is not clear to me what exactly the role of ontologies and reasoning is in this particular work. For instance, a look-up of people who checked in at locations near a crime scene at approximately the same time can be done without any use of ontologies or reasoning. As far as I can tell, the examples shown boil down to simple comparisons using cosine similarity and other well-known methods. There is nothing in the experiments section that would enable a reviewer to evaluate the quality and scope of the presented work.
Another related problem is the level of detail. The figure that illustrates AGILE software development does not contribute anything relevant for the ESWC audience, and neither does the UML use case diagram, especially given that it only shows one actor and no relationship between the use cases/actions. The same can be said about the code snippets, which do not contribute to the understanding of the present work.
On a minor note: geographic coordinates should be reported based on the precision offered by the sensing device (significant digits). Reporting a crime spot down to the level of atoms is not considered good practice.
For a revision, I would suggest that the authors work out their contribution in detail, explain why their tool and use cases would benefit from semantic technologies, introduce a collection of examples to make their case, and provide an evaluation schema to help readers in judging the strengths and weaknesses of their work. I would suggest dropping engineering details such as the reasoner used, the software development approach, Java code snippets and so forth.
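Review 4's remark that a look-up of people who checked in near a crime scene at approximately the same time needs no ontology or reasoning can be illustrated with a minimal sketch. It is not the authors' tool: the `CheckIn` record, the `nearby_checkins` function, the 200 m radius, the one-hour window, and all sample data below are hypothetical choices made for illustration, and the great-circle (haversine) distance stands in for whatever distance measure the paper uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


@dataclass(frozen=True)
class CheckIn:
    """Hypothetical check-in record; not taken from the reviewed paper."""
    user: str
    lat: float
    lon: float
    when: datetime


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearby_checkins(checkins, lat, lon, when,
                    radius_m=200.0, window=timedelta(hours=1)):
    """Keep check-ins within radius_m of the scene and within window of the time."""
    return [
        c for c in checkins
        if abs(c.when - when) <= window
        and haversine_m(c.lat, c.lon, lat, lon) <= radius_m
    ]


# Made-up sample data (assumption): one scene and three check-ins.
crime_time = datetime(2018, 1, 1, 20, 0)
sample = [
    CheckIn("a", 41.3276, 19.8188, datetime(2018, 1, 1, 20, 10)),  # close, in window
    CheckIn("b", 41.4000, 19.9000, datetime(2018, 1, 1, 20, 5)),   # several km away
    CheckIn("c", 41.3275, 19.8187, datetime(2018, 1, 2, 20, 0)),   # a day later
]
hits = nearby_checkins(sample, 41.3275, 19.8187, crime_time)
```

A plain filter like this is the baseline the reviewers ask the ontology-based approach to be compared against; it says nothing about whether real check-in data would be available or sufficient.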
Metareview by Hsofia Pinto
The paper presents a tool which will help to identify potential witnesses by finding social media users that have checked in near the location where a crime has happened. The approach and tool are not described in enough detail, and no evaluation is provided. Due to these serious problems, the paper cannot be recommended. However, the authors are strongly encouraged to work on the comments provided by the reviewers to improve their work.