NSF Award: III: Small: Enhancing Ontology Matching with Visual Analytics

PI: Maria Isabel Cruz

Award Number: 1618126

Abstract

An ontology is a representation of a domain, be it biomedical, business, environmental, or other. In a world where data are predominantly heterogeneous, ontology matching establishes correspondences between the concepts of two ontologies, thus effectively bridging two distinct domains or two different representations of the same domain. Ontology matching is therefore a fundamental tool for data integration, that is, for the creation of a homogeneous gateway to disparate data. Ontology matching systems are made up of several matching algorithms, called matchers. Different matching tasks require different matchers, so there are various configuration and parameter choices to be made, resulting in a multi-dimensional problem whose tuning requires considerable effort and expertise. However, most ontology matching systems operate as a black box, offering no insight into how the output (a set of mappings among ontology concepts, called an alignment) is generated. These systems do not usually give domain experts the opportunity to validate automatically generated mappings and thereby gain control over the matching process. In this project, we use visual analytics, a combination of visualization and analytics, to facilitate ontology matching. Users interact with a visual representation of the matching process and validate mappings that are ranked by underlying analytical methods. This award investigates visual analytics methods and studies their potential benefits. Our collaboration with partners in the biological domain will ensure the practical relevance of our research. From an educational viewpoint, the PI is spearheading a new Data Science curriculum in Computer Science, which can incorporate the main aspects of the proposed research, and the project will train a graduate student and a postdoc in this multidisciplinary field.
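As a rough illustration of the terms used above, the following Python sketch combines two toy matchers into an alignment and orders the resulting mappings for expert validation. It is a minimal sketch under stated assumptions, not the project's method: the two matchers, the averaging step, the 0.5 threshold, the disagreement-based ranking, and the sample labels are all hypothetical choices made for this example.

```python
from difflib import SequenceMatcher

def string_matcher(a: str, b: str) -> float:
    """Character-level similarity of two concept labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def token_matcher(a: str, b: str) -> float:
    """Jaccard overlap of the lowercased word sets of two concept labels."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

# Hypothetical configuration: which matchers run and how their scores combine.
MATCHERS = [string_matcher, token_matcher]

def align(source, target, threshold=0.5):
    """Return candidate mappings as (source, target, score, disagreement)."""
    alignment = []
    for s in source:
        for t in target:
            scores = [m(s, t) for m in MATCHERS]
            score = sum(scores) / len(scores)  # simple average of matcher scores
            if score >= threshold:
                alignment.append((s, t, score, max(scores) - min(scores)))
    return alignment

def validation_queue(alignment):
    """Order mappings so those the matchers disagree on most come first."""
    return sorted(alignment, key=lambda m: m[3], reverse=True)

# Hypothetical concept labels from two small ontologies.
SOURCE = ["heart attack", "kidney"]
TARGET = ["Heart Attack", "kidney disease", "lung"]

if __name__ == "__main__":
    for s, t, score, gap in validation_queue(align(SOURCE, TARGET)):
        print(f"{s!r} -> {t!r}  score={score:.2f}  disagreement={gap:.2f}")
```

Ranking by matcher disagreement is only one plausible way to decide which mappings an expert should look at first; any other uncertainty measure could be substituted in `validation_queue`.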

Driven by data integration needs in a wide range of domains, the field of ontology matching has been prospering. However, the use of visual analytics remains largely unexplored. The proposed research will combine the power of visual analytics with ontology matching to: (1) open up the ontology matching process so as to facilitate its configuration by domain experts; (2) reduce the number of mappings to be validated by the experts so as to achieve high quality results with minimum effort; and (3) investigate a methodology to evaluate the benefits of combining ontology matching with visual analytics. For the visualization design, a principled approach will be followed that provides prescriptive guidance for determining appropriate evaluation approaches, while for the manipulation of visualized data a taxonomy of interactive operations will be used. Further, the design and analysis of the workflow that describes the interactive nature of the overall process will facilitate the study of the complex interdependencies between the data manipulation and the visualization components. The web site of this project is available at https://www.cs.uic.edu/Cruz/OntologyMatchingVisualAnalytics.

First Year Report (2016-2017)
Major goals

The proposed research will combine the power of visual analytics with ontology matching to: (1) open up the ontology matching process so as to facilitate its configuration by domain experts; (2) reduce the number of mappings to be validated by the experts so as to achieve high quality results with minimum effort; and (3) investigate a methodology to evaluate the benefits of combining ontology matching with visual analytics.

Accomplishments

Ontology matching often uses external knowledge bases to establish the connection among concepts in different ontologies. We have investigated a distantly supervised approach to derive spatio-temporal relationships from text documents, which enrich knowledge bases with dynamic facts. Specifically, we have analyzed corpora of noun phrases, appositions, and adjectives to build templates characterizing geospatial and temporal data. We have successfully evaluated the effectiveness of our approach using both automated and manual methods on the YAGO knowledge base.

We have extended the ontology matching system AgreementMakerLight (in collaboration with researchers at the U. of Lisbon and IGC) to incorporate more scalable matching methods. The quality of our matching results has been recognized by our placement at the top of the more than 20 systems that competed in the 2016 Ontology Alignment Evaluation Initiative (OAEI), where we received the Pistoia Alliance first prize for the Disease and Phenotype track. We have disseminated our work by contributing open source code to GitHub. AgreementMakerLight has been used by experts to match ontologies describing biological information networks. These experts have been using a visual interface for the AgreementMakerLight system to explore a subset of the mappings in detail and determine their correctness. They have identified some of the visual operations they perform and how these correspond to their verification methods.

We have used Association Rule Mining and ontologies to extract patterns from repositories of crime data. Our technique has been incorporated into a system that displays query results and supports their analysis via a geospatial user interface. To analyze a Chicago crime dataset, we have built a crime ontology by matching crime classification schemes from the FBI and the Chicago Police Department. Our experiments show that we can significantly reduce the number of rules without sacrificing query precision.
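For readers unfamiliar with association rule mining, the sketch below shows the standard support and confidence measures and how thresholds on them cut down the number of rules. It is a minimal illustration only: the toy records, attribute values, and threshold values are invented for this example, and the ontology-based step described above is not modeled here.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def mine_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return single-antecedent rules (lhs, rhs, support, confidence)
    that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # infrequent pair: no rule is generated from it
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules

# Hypothetical incident records; each set holds the attributes of one incident.
RECORDS = [
    {"theft", "night", "downtown"},
    {"theft", "night"},
    {"assault", "night"},
    {"theft", "day", "downtown"},
    {"theft", "night", "downtown"},
]

if __name__ == "__main__":
    for lhs, rhs, sup, conf in mine_rules(RECORDS):
        print(f"{lhs} => {rhs}  support={sup:.2f}  confidence={conf:.2f}")
```

Raising `min_confidence` shrinks the rule set further; with these toy records, a threshold of 0.9 leaves only the rule from `downtown` to `theft`.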

We have investigated two formal frameworks to describe the requirements of a system that performs ontology matching extended with visual interaction, namely the Nested Model and the NOVIS model.

Products
  • Balasubramani, Booma Sowkarthiga ; Shivaprabhu, Vivek R. ; Krishnamurthy, Smitha ; Cruz, Isabel F. ; Malik, Tanu (2016). Ontology-based urban data exploration. 4th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. Status = PUBLISHED; Acknowledgment of Federal Support = Yes ; Peer Reviewed = Yes ; DOI: 10.1145/3007540.3007550

  • Cheatham, Michelle ; Cruz, Isabel F. ; Euzenat, Jérôme ; Pesquita, Cátia (2017). Special issue on ontology and linked data matching. Semantic Web. 8 (2). Status = PUBLISHED; Acknowledgment of Federal Support = No ; DOI: 10.3233/SW-160251

  • Mirrezaei, Seyed Iman ; Martins, Bruno ; Cruz, Isabel F. (2016). A distantly supervised method for extracting spatio-temporal information from text. 4th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. Status = PUBLISHED; Acknowledgment of Federal Support = Yes ; Peer Reviewed = Yes ; DOI: 10.1145/2996913.2996967

  • Faria, Daniel ; Pesquita, Cátia ; Balasubramani, Booma S. ; Martins, Catarina ; Cardoso, João ; Curado, Hugo ; Couto, Francisco M. ; Cruz, Isabel F. (2016). Ontology-based urban data exploration. 11th International Workshop on Ontology Matching co-located with the 15th International Semantic Web Conference. Status = PUBLISHED; Acknowledgment of Federal Support = Yes
Students

Vivek Shivaprabhu

Plans for the following year

We plan to continue the identification of the matching and visual operations using the Nested Model.

We also plan to continue extending the AgreementMakerLight system to incorporate our research findings.

Regarding broader impacts, we plan to give assignments in an introductory Data Science course that address data integration and visual methods for data analysis.

Second Year Report (2017-2018)
Major goals

Products

Students

Vivek Shivaprabhu
Zhu (Ellen) Wang

Plans for the following year
Topic revision: r6 - 2018-07-29 - 20:44:01 - Main.ifcruz
 
Copyright 2016 The Board of Trustees of the University of Illinois. webmaster@cs.uic.edu