
Enhancing Geometric Ontology Embeddings for $\mathcal{EL}^{++}$ with Negative Sampling and Deductive Closure Filtering

Olga Mashkova, Fernando Zhapa-Camacho, Robert Hoehndorf

TL;DR

The paper addresses two limitations of geometric embeddings for $\mathcal{EL}^{++}$: they conflate unprovable statements with provably false ones, and they overlook the deductive closure. It introduces deductive-closure-aware negative sampling and new negative losses across multiple normal forms, together with an approximate deductive closure algorithm to guide training. Using ELEmbeddings as a baseline, the authors demonstrate improvements in knowledge base completion on GO/STRING datasets for yeast and human proteins, while revealing dataset biases and showing that models differ in how strongly they favor entailed axioms versus novel axioms. The findings highlight the importance of entailment-aware evaluation and sampling, with implications for broader geometric ontology embeddings and potential extensions to more expressive models such as Box2EL.

Abstract

Ontology embeddings map classes, relations, and individuals in ontologies into $\mathbb{R}^n$, where similarity between entities can be computed or new axioms inferred. For ontologies in the Description Logic $\mathcal{EL}^{++}$, several embedding methods have been developed that explicitly generate models of an ontology. However, these methods have limitations: they do not distinguish between statements that are unprovable and statements that are provably false, and they may therefore use entailed statements as negatives. Furthermore, they do not utilize the deductive closure of an ontology to identify statements that are inferred but not asserted. We evaluated a set of embedding methods for $\mathcal{EL}^{++}$ ontologies based on high-dimensional ball representations of concept descriptions, incorporating several modifications that aim to make use of the ontology's deductive closure. In particular, we designed novel negative losses that account both for the deductive closure and for different types of negatives. We demonstrate that our embedding methods improve over the baseline ontology embedding in the task of knowledge base or ontology completion.
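The idea of not using entailed statements as negatives can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes only subsumption axioms $C \sqsubseteq D$ between named classes, uses a reflexive-transitive closure as a stand-in for the deductive closure, and the function names `subsumption_closure` and `sample_negative` are hypothetical.

```python
import random
from collections import defaultdict


def subsumption_closure(axioms):
    """Reflexive-transitive closure of asserted (sub, sup) pairs.

    Assumption: this simple closure stands in for the ontology's
    deductive closure; a real reasoner covers far more axiom types.
    """
    parents = defaultdict(set)
    classes = set()
    for sub, sup in axioms:
        parents[sub].add(sup)
        classes.update((sub, sup))

    closure = set()
    for c in classes:
        # Depth-first walk over asserted superclasses of c.
        stack, seen = [c], {c}
        while stack:
            node = stack.pop()
            for p in parents[node]:
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        closure.update((c, d) for d in seen)
    return closure, sorted(classes)


def sample_negative(sub, classes, closure, rng, max_tries=100):
    """Corrupt the superclass of an axiom, rejecting entailed candidates."""
    for _ in range(max_tries):
        candidate = rng.choice(classes)
        if (sub, candidate) not in closure:
            return (sub, candidate)
    return None  # no non-entailed candidate found within max_tries


# Toy ontology (illustrative names only): A ⊑ B, B ⊑ C, D ⊑ C.
axioms = [("A", "B"), ("B", "C"), ("D", "C")]
closure, classes = subsumption_closure(axioms)
negative = sample_negative("A", classes, closure, random.Random(0))
```

In this toy ontology, `("A", "C")` is inferred but not asserted, so unfiltered sampling could pick it as a negative. Filtering against the closure leaves `("A", "D")` as the only candidate. Note that `A ⊑ D` is merely unprovable here, not provably false. Distinguishing those two cases needs more than this filter provides.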

Paper Structure (27 sections, 9 equations, 14 figures, 11 tables, 1 algorithm)


Figures (14)

  • Figure 1: ROC curves, Yeast iw dataset
  • Figure 2: ROC curves, Yeast hf dataset
  • Figure 3: ROC curves, Human iw dataset
  • Figure 4: ROC curves, Human hf dataset
  • Figure 5: ROC curves, Yeast iw dataset
  • ...and 9 more figures