Agent-OM: Leveraging LLM Agents for Ontology Matching
Zhangcheng Qiang, Weiqing Wang, Kerry Taylor
TL;DR
Agent-OM is a novel agent-powered LLM framework for ontology matching (OM). It uses two Siamese LLM agents, one for retrieval and one for matching, that interact through a memory-based hybrid database and a suite of OM tools. The approach decomposes OM into retrieval and matching phases, uses planning, tool calls, and memory to mitigate LLM limitations, and applies Reciprocal Rank Fusion, a Matching Validator, and a Merging step to improve precision and F1. Evaluations on three OAEI tracks show that Agent-OM comes very close to the long-standing best performance on simple OM tasks and significantly improves performance on complex and few-shot OM tasks, with ablations supporting the value of planning, tool usage, and memory. The work demonstrates the practicality and scalability of LLM-agent-based OM, discusses its limitations (ABox, cost, and model access), and outlines future work on multimodal, multilingual, and more efficient matching.
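Reciprocal Rank Fusion, named above, combines several ranked candidate lists by giving each candidate a score of 1/(k + rank) per list and summing the scores. The following is a minimal sketch of that general technique, not the Agent-OM implementation: the function name, the three example rankings, the entity names, and the constant k = 60 (a commonly used default) are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists into a single ranking.

    rankings: list of lists of candidate entity names (hypothetical data).
    k: smoothing constant; 60 is a common default and an assumption here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top candidate contributes 1 / (k + 1).
        for rank, entity in enumerate(ranking, start=1):
            scores[entity] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by name for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical candidate lists for one source entity, one per retrieval signal.
by_signal_a = ["Paper", "Article", "Document"]
by_signal_b = ["Article", "Paper", "Review"]
by_signal_c = ["Article", "Document", "Paper"]

fused = reciprocal_rank_fusion([by_signal_a, by_signal_b, by_signal_c])
for entity, score in fused:
    print(f"{entity}: {score:.4f}")
```

In this toy input, "Article" ends up ahead of "Paper" because it is ranked first in two of the three lists, which illustrates how the fusion rewards agreement across signals rather than a single high rank.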
Abstract
Ontology matching (OM) enables semantic interoperability between different ontologies and resolves their conceptual heterogeneity by aligning related entities. OM systems currently have two prevailing design paradigms: conventional knowledge-based expert systems and newer machine learning-based predictive systems. While large language models (LLMs) and LLM agents have revolutionised data engineering and have been applied creatively in many domains, their potential for OM remains underexplored. This study introduces a novel agent-powered LLM-based design paradigm for OM systems. Considering several specific challenges in leveraging LLM agents for OM, we propose a generic framework, namely Agent-OM (Agent for Ontology Matching), consisting of two Siamese agents for retrieval and matching, with a set of OM tools. Our framework is implemented in a proof-of-concept system. Evaluations on three Ontology Alignment Evaluation Initiative (OAEI) tracks against state-of-the-art OM systems show that our system achieves results very close to the long-standing best performance on simple OM tasks and significantly improves performance on complex and few-shot OM tasks.
