OntoAligner Meets Knowledge Graph Embedding Aligners

Hamed Babaei Giglou, Jennifer D'Souza, Sören Auer, Mahsa Sanaei

TL;DR

This work investigates using Knowledge Graph Embeddings (KGEs) for Ontology Alignment (OA) by reframing OA as a link-prediction task over merged ontologies. It introduces GraphEmbeddingAligner, a modular component within OntoAligner, leveraging 17 KGE models to learn cross-ontology representations from a unified triple factory and perform alignments via cosine similarity, with one-to-one and threshold-based post-processing. Across seven OA benchmark tasks spanning five domains, ConvE and TransF emerge as strong performers, delivering high-precision alignments, while recall remains moderate, indicating a complementary role to context-rich LLM approaches. The results demonstrate the practicality and scalability of embedding-based OA, suggesting future work on adaptive thresholding, hybrid LLM-KGE models, and domain-specific enhancements to further improve accuracy in complex settings.

Abstract

Ontology Alignment (OA) is essential for enabling semantic interoperability across heterogeneous knowledge systems. While recent advances have focused on large language models (LLMs) for capturing contextual semantics, this work revisits the underexplored potential of Knowledge Graph Embedding (KGE) models, which offer scalable, structure-aware representations well-suited to ontology-based tasks. Despite their effectiveness in link prediction, KGE methods remain underutilized in OA, with most prior work focusing narrowly on a few models. To address this gap, we reformulate OA as a link prediction problem over merged ontologies represented as RDF-style triples and develop a modular framework, integrated into the OntoAligner library, that supports 17 diverse KGE models. The system learns embeddings from a combined ontology and aligns entities by computing cosine similarity between their representations. We evaluate our approach using standard metrics across seven benchmark datasets spanning five domains: Anatomy, Biodiversity, Circular Economy, Material Science and Engineering, and Biomedical Machine Learning. Two key findings emerge: first, KGE models like ConvE and TransF consistently produce high-precision alignments, outperforming traditional systems in structure-rich and multi-relational domains; second, while their recall is moderate, this conservatism makes KGEs well-suited for scenarios demanding high-confidence mappings. Unlike LLM-based methods that excel at contextual reasoning, KGEs directly preserve and exploit ontology structure, offering a complementary and computationally efficient strategy. These results highlight the promise of embedding-based OA and open pathways for further work on hybrid models and adaptive strategies.
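
The alignment step described above (cosine similarity between learned embeddings, followed by threshold-based and one-to-one filtering) can be illustrated with a minimal sketch. It is not the OntoAligner implementation: the function names `cosine` and `align`, the greedy matching strategy, the 0.8 threshold, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def align(source_emb, target_emb, threshold=0.8):
    """Greedy one-to-one alignment over cosine similarity (hypothetical helper).

    source_emb / target_emb: dict mapping entity id -> embedding vector.
    Returns a list of (source, target, score) with score >= threshold.
    """
    # Score every cross-ontology pair, then keep pairs at or above the threshold.
    candidates = [
        (cosine(u, v), s, t)
        for s, u in source_emb.items()
        for t, v in target_emb.items()
    ]
    candidates = [c for c in candidates if c[0] >= threshold]
    # Highest score first; each entity may appear in at most one match.
    candidates.sort(key=lambda c: c[0], reverse=True)
    used_s, used_t, matches = set(), set(), []
    for score, s, t in candidates:
        if s not in used_s and t not in used_t:
            matches.append((s, t, score))
            used_s.add(s)
            used_t.add(t)
    return matches

# Toy embeddings, made up for illustration only.
source = {"src:Heart": [1.0, 0.0], "src:Lung": [0.0, 1.0]}
target = {"tgt:heart": [0.9, 0.1], "tgt:kidney": [0.6, 0.6]}
matches = align(source, target, threshold=0.8)
```

Raising the threshold trades recall for precision, which is consistent with the high-precision, moderate-recall behaviour reported for the KGE aligners.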

Paper Structure

This paper contains 20 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: The architecture of the proposed framework, comprising three main stages: Parser, Encoder, and Aligner. The model ingests source and target ontologies, encodes them as unified triples, learns low-dimensional embeddings via a KGE model (trained in PyKEEN), and finally computes alignment predictions through cosine similarity.
  • Figure 2: Columns 1 and 2: Precision and recall analysis across aligners and tasks. Column 3: Representation-learning and inference time analysis.
  • Figure 3: Scatter plot showing the average CPU utilization (%) against memory consumption (MB) for each KGE aligner. Each color represents a distinct model. This visualization highlights the relative computational efficiency and resource demands of the models.
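
The parser-to-encoder hand-off in Figure 1, where source and target ontologies are encoded as unified triples, might look roughly like the sketch below. It does not call PyKEEN or OntoAligner. The helper `merge_ontologies`, the `src:`/`tgt:` prefixes, and the toy triples are hypothetical, and the real framework builds a triple factory rather than plain index tuples.

```python
def merge_ontologies(source_triples, target_triples):
    """Combine two (head, relation, tail) lists into one indexed training set."""
    triples = list(source_triples) + list(target_triples)
    # A single shared vocabulary places both ontologies in one embedding space.
    entities = sorted({e for h, _, t in triples for e in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    entity_to_id = {e: i for i, e in enumerate(entities)}
    relation_to_id = {r: i for i, r in enumerate(relations)}
    indexed = [
        (entity_to_id[h], relation_to_id[r], entity_to_id[t])
        for h, r, t in triples
    ]
    return indexed, entity_to_id, relation_to_id

# Toy triples, made up for illustration only.
source = [("src:Heart", "subClassOf", "src:Organ")]
target = [
    ("tgt:heart", "subClassOf", "tgt:organ"),
    ("tgt:heart", "partOf", "tgt:body"),
]
indexed, entity_to_id, relation_to_id = merge_ontologies(source, target)
```

Because relation names such as `subClassOf` are shared across both inputs, a KGE model trained on the merged set can relate entities from the two ontologies through common structure.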