LLMs4OM: Matching Ontologies with Large Language Models

Hamed Babaei Giglou, Jennifer D'Souza, Felix Engel, Sören Auer

TL;DR

This paper introduces LLMs4OM, a retrieval-augmented framework for ontology matching that combines three concept representations ($C$, $CP$, $CC$) with retrieval models and zero-shot LLM prompting to determine equivalence between source and target concepts. The authors evaluate seven LLMs across four retrievers on 20 OM tasks drawn from five OAEI tracks, revealing that LLMs can match or surpass traditional OM systems in many complex scenarios, especially when structural context is provided. A four-step pipeline—concept representation, retrieval, LLM scoring, and post-processing—reduces computational complexity and mitigates hallucination risks compared with naive full-ontology prompting. Key findings include the superiority of the $C$ representation for retrieval, the effectiveness of ada-based retrievers, and strong per-track LLM performance (notably GPT-3.5 and Mistral-7B), although Bio-ML remains challenging and context-aware prompts are essential. The work demonstrates the practical potential of LLMs for OM and provides an open-source framework for further exploration in cross-ontology alignment and knowledge integration.
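
The four steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Concept` class, the function names, the prompt wording, the threshold values, and the textual form of each representation are hypothetical. A simple token-overlap (Jaccard) score stands in for the retrieval models the paper evaluates, and `llm_judge` is an assumed callable that returns a confidence in [0, 1] that two concepts are equivalent.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Concept:
    # Hypothetical container for one ontology concept.
    iri: str
    label: str
    parents: list[str] = field(default_factory=list)
    children: list[str] = field(default_factory=list)


def represent(concept: Concept, mode: str = "C") -> str:
    """Step 1: build the text for the C, CP, or CC representation."""
    if mode == "C":
        return concept.label
    if mode == "CP":
        return f"{concept.label}, parents: {', '.join(concept.parents)}"
    if mode == "CC":
        return f"{concept.label}, children: {', '.join(concept.children)}"
    raise ValueError(f"unknown representation: {mode}")


def jaccard(a: str, b: str) -> float:
    # Token-overlap similarity; an assumed stand-in for a real retriever.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def retrieve(source: Concept, targets: list[Concept], top_k: int = 5,
             mode: str = "C") -> list[tuple[Concept, float]]:
    """Step 2: keep only the top_k most similar target concepts."""
    query = represent(source, mode)
    scored = [(t, jaccard(query, represent(t, mode))) for t in targets]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def match(sources: list[Concept], targets: list[Concept],
          llm_judge: Callable[[str], float], top_k: int = 5,
          retriever_threshold: float = 0.1, llm_threshold: float = 0.5,
          mode: str = "C") -> list[tuple[str, str, float]]:
    """Steps 3 and 4: score candidates with the LLM, then filter.

    The filtering rule here (two thresholds, best candidate per source)
    is an assumption made for illustration.
    """
    alignments = []
    for src in sources:
        best = None
        for tgt, r_score in retrieve(src, targets, top_k, mode):
            if r_score < retriever_threshold:
                continue
            prompt = (
                "Are these two concepts equivalent? Answer yes or no.\n"
                f"Source: {represent(src, mode)}\n"
                f"Target: {represent(tgt, mode)}"
            )
            confidence = llm_judge(prompt)
            if confidence >= llm_threshold and (
                best is None or confidence > best[2]
            ):
                best = (src.iri, tgt.iri, confidence)
        if best is not None:
            alignments.append(best)
    return alignments


def stub_judge(prompt: str) -> float:
    # Toy replacement for an LLM call: "equivalent" only on identical text.
    lines = prompt.splitlines()
    src = lines[-2].split(": ", 1)[1]
    tgt = lines[-1].split(": ", 1)[1]
    return 1.0 if src == tgt else 0.0


sources = [Concept("s:1", "heart muscle", parents=["muscle"])]
targets = [
    Concept("t:1", "heart muscle"),
    Concept("t:2", "skeletal muscle"),
    Concept("t:3", "kidney"),
]
alignments = match(sources, targets, stub_judge, top_k=2)
print(alignments)
```

Because the LLM only sees the `top_k` retrieved candidates for each source concept, the number of prompts grows with the number of source concepts times `top_k` rather than with the full source-by-target cross product, which is the complexity reduction the pipeline aims for.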

Abstract

Ontology Matching (OM) is a critical task in knowledge integration, where aligning heterogeneous ontologies facilitates data interoperability and knowledge sharing. Traditional OM systems often rely on expert knowledge or predictive models, with limited exploration of the potential of Large Language Models (LLMs). We present the LLMs4OM framework, a novel approach to evaluate the effectiveness of LLMs in OM tasks. This framework utilizes two modules for retrieval and matching, respectively, enhanced by zero-shot prompting across three ontology representations: concept, concept-parent, and concept-children. Through comprehensive evaluations using 20 OM datasets from various domains, we demonstrate that LLMs, under the LLMs4OM framework, can match and even surpass the performance of traditional OM systems, particularly in complex matching scenarios. Our results highlight the potential of LLMs to significantly contribute to the field of OM.

Paper Structure (9 sections, 2 figures, 1 table)

This paper contains 9 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Overview of LLMs4OM as an end-to-end framework for OM.
  • Figure 2: Comparing retrieval models using recall and $top_k=5$.