GenOM: Ontology Matching with Description Generation and Large Language Model
Yiping Song, Jiaoyan Chen, Renate A. Schmidt
TL;DR
GenOM is an LLM-enhanced ontology matching framework that enriches concepts with generated textual definitions, retrieves candidates with an embedding model, uses LLM-driven binary judgments to decide equivalence, and post-processes the results with exact matching. Evaluated on five biomedical ontologies from the OAEI Bio-ML track, GenOM achieves competitive performance and outperforms many traditional and recent LLM-based baselines. Larger models such as Qwen32B deliver stronger, more threshold-robust results, and ablations confirm the value of semantic enrichment and few-shot prompting. The work contributes: (1) a modular GenOM pipeline, (2) a novel evaluation framework for LLM-generated definitions, (3) a cross-model analysis of LLM scales, and (4) evidence that semantic enrichment improves both candidate retrieval and final judgments. The findings suggest that LLM-driven semantic enrichment can improve biomedical ontology alignment and make it more robust for real-world semantic interoperability; dynamic thresholds and additional alignment types are left to future work.
Abstract
Ontology matching (OM) plays an essential role in enabling semantic interoperability and integration across heterogeneous knowledge sources, particularly in the biomedical domain, which contains numerous complex concepts related to diseases and pharmaceuticals. This paper introduces GenOM, a large language model (LLM)-based ontology alignment framework that enriches the semantic representations of ontology concepts by generating textual definitions, retrieves alignment candidates with an embedding model, and incorporates exact matching-based tools to improve precision. Extensive experiments on the OAEI Bio-ML track demonstrate that GenOM often achieves competitive performance, surpassing many baselines, including traditional OM systems and recent LLM-based methods. Further ablation studies confirm the effectiveness of semantic enrichment and few-shot prompting, highlighting the framework's robustness and adaptability.
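
To make the flow of the stages described above concrete, the following is a minimal, self-contained sketch in Python. It is an illustration under stated assumptions and not the authors' implementation. The definitions are hard-coded rather than LLM-generated, a toy bag-of-words vector stands in for the embedding model, and a similarity threshold stands in for the LLM's binary judgment. All function names (enrich, embed, retrieve_candidates, judge_equivalent, match), the parameters k and threshold, and the example concepts are hypothetical. The way exact matching is applied in post-processing (an exact label match replaces any other mapping for that source concept) is one possible reading of that step.

```python
import math
from collections import Counter

def enrich(label, definition):
    # Semantic enrichment: combine the label with a textual definition.
    # Assumption: in the real system the definition is LLM-generated.
    return f"{label}: {definition}"

def embed(text):
    # Toy bag-of-words vector; a placeholder for a real embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_candidates(src_text, tgt_texts, k=2):
    # Rank target concepts by similarity and keep the top k.
    scored = [(cosine(embed(src_text), embed(t)), tid)
              for tid, t in tgt_texts.items()]
    scored.sort(reverse=True)
    return [tid for _, tid in scored[:k]]

def judge_equivalent(src_text, tgt_text, threshold=0.5):
    # Placeholder for the LLM's yes/no equivalence judgment.
    return cosine(embed(src_text), embed(tgt_text)) >= threshold

def match(source, target, k=2):
    # source and target map concept id -> (label, definition).
    tgt_texts = {tid: enrich(*v) for tid, v in target.items()}
    mappings = set()
    for sid, (label, definition) in source.items():
        s_text = enrich(label, definition)
        for tid in retrieve_candidates(s_text, tgt_texts, k):
            if judge_equivalent(s_text, tgt_texts[tid]):
                mappings.add((sid, tid))
    # Post-processing (assumed behaviour): an exact label match
    # replaces any other mapping for that source concept.
    label_index = {v[0].lower(): tid for tid, v in target.items()}
    for sid, (label, _) in source.items():
        tid = label_index.get(label.lower())
        if tid is not None:
            mappings = {m for m in mappings if m[0] != sid}
            mappings.add((sid, tid))
    return mappings

# Made-up example concepts, for illustration only.
source = {
    "s1": ("Myocardial infarction",
           "necrosis of heart muscle caused by blocked blood supply"),
    "s2": ("Aspirin", "a drug used to reduce pain and fever"),
}
target = {
    "t1": ("Heart attack",
           "necrosis of heart muscle caused by blocked blood supply"),
    "t2": ("Aspirin", "analgesic medication"),
    "t3": ("Renal failure", "loss of kidney function"),
}

result = match(source, target)
```

In this toy run, s1 and t1 have different labels but are matched through their shared definition text, which is the effect semantic enrichment is meant to have, while s2 and t2 fall below the stub judgment threshold and are recovered only by the exact label match.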
