Uncertainty in Automated Ontology Matching: Lessons Learned from an Empirical Experimentation

Inès Osman, Salvatore F. Pileggi, Sadok Ben Yahia

TL;DR

The paper investigates uncertainty in automated ontology matching within a practical knowledge-building workflow that converts relational data to ontologies, matches them, and integrates the results into a knowledge graph. It analyzes the sources of uncertainty, evaluates existing matching tools (LogMap and AML) on a spatio-temporal case study, and shows that fully automated matching yields significant uncertainty, even where precision is high in some cases. Alignment trimming and disambiguation can reduce this uncertainty, but often at the cost of missing correct correspondences, which motivates semi-supervised or human-in-the-loop approaches. In practice, automated ontology matching alone is not reliable enough for large-scale data integration; the paper advocates semi-supervised strategies and new uncertainty metrics to improve real-world interoperability and data reuse.
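The trade-off between trimming, disambiguation, and missed correspondences can be illustrated with a minimal Python sketch. It is not the paper's implementation and does not call LogMap or AML: the `Correspondence` type, the function names, the entity names, the confidence scores, and the reference alignment are all hypothetical, and the disambiguation step is assumed to be a simple greedy one-to-one selection in the spirit of a simplified stable-marriage approach.

```python
from typing import NamedTuple


class Correspondence(NamedTuple):
    source: str        # entity from the first ontology (hypothetical name)
    target: str        # entity from the second ontology (hypothetical name)
    confidence: float  # matcher score, assumed to lie in [0, 1]


def trim(alignment, threshold):
    """Alignment trimming: keep correspondences scoring at least `threshold`."""
    return [c for c in alignment if c.confidence >= threshold]


def disambiguate(alignment):
    """Greedy one-to-one selection: visit correspondences by decreasing
    confidence and keep one only if neither entity is already matched."""
    used_sources, used_targets, kept = set(), set(), []
    for c in sorted(alignment, key=lambda c: c.confidence, reverse=True):
        if c.source not in used_sources and c.target not in used_targets:
            kept.append(c)
            used_sources.add(c.source)
            used_targets.add(c.target)
    return kept


def precision_recall(alignment, reference):
    """Precision and recall of an alignment against a reference set of pairs."""
    found = {(c.source, c.target) for c in alignment}
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Toy data, invented for illustration only.
raw = [
    Correspondence("a:Country", "b:Nation", 0.95),
    Correspondence("a:Country", "b:Region", 0.60),  # ambiguous and wrong
    Correspondence("a:Year", "b:Year", 0.90),
    Correspondence("a:Area", "b:Region", 0.40),     # correct but low score
]
reference = {
    ("a:Country", "b:Nation"),
    ("a:Year", "b:Year"),
    ("a:Area", "b:Region"),
}

trimmed = trim(raw, threshold=0.5)
cleaned = disambiguate(trimmed)

for name, alignment in [("raw", raw), ("trimmed", trimmed), ("cleaned", cleaned)]:
    p, r = precision_recall(alignment, reference)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

On this toy data, trimming followed by disambiguation removes the ambiguous correspondence and raises precision, but the low-confidence correct correspondence is also discarded, so recall falls. This mirrors the kind of loss the summary describes; the actual thresholds and strategies used in the study may differ.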

Abstract

Data integration is considered a classic research field and a pressing need within the information science community. Ontologies play a critical role in such a process by providing well-consolidated support to link and semantically integrate datasets via interoperability. This paper approaches data integration from an application perspective, looking at techniques based on ontology matching. An ontology-based process may only be considered adequate by assuming manual matching of different sources of information. However, since the approach becomes unrealistic once the system scales up, automation of the matching process becomes a compelling need. Therefore, we have conducted experiments on actual data with the support of existing tools for automatic ontology matching from the scientific community. Even considering a relatively simple case study (i.e., the spatio-temporal alignment of global indicators), outcomes clearly show significant uncertainty resulting from errors and inaccuracies along the automated matching process. More concretely, this paper aims to test on real-world data a bottom-up knowledge-building approach, discuss the lessons learned from the experimental results of the case study, and draw conclusions about uncertainty and uncertainty management in an automated ontology matching process. While the most common evaluation metrics clearly demonstrate the unreliability of fully automated matching solutions, properly designed semi-supervised approaches seem to be mature for a more generalized application.

Paper Structure (34 sections, 2 figures, 12 tables)


Figures (2)

  • Figure 1: Knowledge-Building Process Driven by Data Integration.
  • Figure 2: Alignment Disambiguation (A Simplified Approach for the Stable Marriage Problem; Gale and Shapley, 1962).

Theorems & Definitions (5)

  • Example 1: Ambiguity Case
  • Example 2: Ambiguity Case
  • Example 3: Ambiguity Case
  • Example 4: Semantic Matching Case
  • Example 5: Abbreviations and Acronyms Case