Approximating Probabilistic Inference in Statistical EL with Knowledge Graph Embeddings

Yuqicheng Zhu, Nico Potyka, Bo Xiong, Trung-Kien Tran, Mojtaba Nayyeri, Evgeny Kharlamov, Steffen Staab

TL;DR

The paper addresses the hardness of probabilistic reasoning in Statistical EL (SEL) by using knowledge graph embeddings for scalable, approximate inference. It generalizes BoxEL-style embeddings to SEL, formalizes a normal form for SEL, and proves soundness guarantees together with linear-time inference in the embedding space. Empirically, embedding and inference errors are low, and the approximation gap shrinks as more embeddings are used; ablation studies identify the regularization choices and the affine treatment of relations as crucial. The approach trades exactness for scalability while keeping soundness guarantees, which opens the way to efficient probabilistic reasoning over large ontologies.

Abstract

Statistical information is ubiquitous but drawing valid conclusions from it is prohibitively hard. We explain how knowledge graph embeddings can be used to approximate probabilistic inference efficiently using the example of Statistical EL (SEL), a statistical extension of the lightweight Description Logic EL. We provide proofs for runtime and soundness guarantees, and empirically evaluate the runtime and approximation quality of our approach.

Paper Structure

This paper contains 23 sections, 10 theorems, 26 equations, 3 figures, 10 tables, 1 algorithm.

Key Result

Lemma 1

For all $\mathcal{EL}$ interpretations $\mathcal{I}$, we have $\mathcal{I} \models C \sqsubseteq D$ iff $\mathcal{I} \models (D \mid C)[1]$.
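
Lemma 1 can be checked by hand on a small finite interpretation. The sketch below is illustrative only: it assumes the usual proportion-based reading of a statistical conditional, where $(D \mid C)[l,u]$ holds if $C$ has an empty extension or if the fraction of $C$-instances that are also $D$-instances lies in $[l,u]$, and it writes $[1]$ as the interval $[1,1]$. The function names and the individuals `ann`, `bob`, `cem`, `dee` are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def satisfies_subsumption(ext_c, ext_d):
    """I |= C ⊑ D: every instance of C is also an instance of D."""
    return ext_c <= ext_d


def conditional_proportion(ext_c, ext_d):
    """Fraction of C-instances that are D-instances, or None if C is empty."""
    if not ext_c:
        return None
    return Fraction(len(ext_c & ext_d), len(ext_c))


def satisfies_conditional(ext_c, ext_d, lower, upper):
    """I |= (D | C)[lower, upper] under the assumed proportion semantics.

    An empty extension of C satisfies the conditional vacuously.
    """
    proportion = conditional_proportion(ext_c, ext_d)
    return proportion is None or lower <= proportion <= upper


# Hypothetical interpretation: extensions of two concept names.
student = frozenset({"ann", "bob", "cem", "dee"})
undergraduate = frozenset({"ann", "bob"})

# Undergraduate ⊑ Student holds, and so does (Student | Undergraduate)[1].
holds_as_subsumption = satisfies_subsumption(undergraduate, student)
holds_as_conditional = satisfies_conditional(undergraduate, student, 1, 1)

# Student ⊑ Undergraduate fails, and so does (Undergraduate | Student)[1],
# because only half of the students are undergraduates.
fails_as_subsumption = satisfies_subsumption(student, undergraduate)
fails_as_conditional = satisfies_conditional(student, undergraduate, 1, 1)
```

In both directions the two checks agree, which is what the lemma states: a classical subsumption is the special case of a conditional whose probability is exactly 1.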

Figures (3)

  • Figure 1: Two possible 2D embeddings of the concepts Student (green), Undergraduate Student (blue) and Computer Science Student (red) that maintain proportions stated in the knowledge base.
  • Figure 2: We conduct experiments on 5 different query sets and train 60 embeddings for each query set. The figure shows the mean embedding error and mean inference error with respect to the number of epochs.
  • Figure 3: Approximation gap for approximation based on 10, 20, 40 and 60 embeddings.
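
The idea behind Figure 1, that different box embeddings can encode the same stated proportions, can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: concepts are modelled as axis-aligned 2D boxes, a conditional proportion is read off as a ratio of box volumes, and the box coordinates and the 50% proportions are hypothetical rather than taken from the paper.

```python
from math import prod


def volume(box):
    """Volume of an axis-aligned box given as (lower_corner, upper_corner)."""
    lower, upper = box
    return prod(max(0.0, u - l) for l, u in zip(lower, upper))


def intersect(box_a, box_b):
    """Intersection of two axis-aligned boxes (may have zero volume)."""
    lower = tuple(max(a, b) for a, b in zip(box_a[0], box_b[0]))
    upper = tuple(min(a, b) for a, b in zip(box_a[1], box_b[1]))
    return lower, upper


def conditional_volume(box_d, box_c):
    """Vol(C ∩ D) / Vol(C), or None if C has zero volume."""
    vol_c = volume(box_c)
    if vol_c == 0:
        return None
    return volume(intersect(box_c, box_d)) / vol_c


# Two hypothetical embeddings of Student, Undergraduate and CS Student.
embedding_1 = {
    "Student": ((0.0, 0.0), (4.0, 2.0)),
    "Undergraduate": ((0.0, 0.0), (2.0, 2.0)),
    "CSStudent": ((1.0, 0.0), (3.0, 2.0)),
}
embedding_2 = {
    "Student": ((0.0, 0.0), (2.0, 4.0)),
    "Undergraduate": ((0.0, 0.0), (2.0, 2.0)),
    "CSStudent": ((0.0, 1.0), (2.0, 3.0)),
}

# Both embeddings give the same proportions although the boxes differ.
proportions = [
    (
        conditional_volume(e["Undergraduate"], e["Student"]),
        conditional_volume(e["CSStudent"], e["Student"]),
        conditional_volume(e["Undergraduate"], e["CSStudent"]),
    )
    for e in (embedding_1, embedding_2)
]
```

Because several embeddings can fit the same knowledge base, a query that the knowledge base does not pin down may receive different values across embeddings, which is one way to read the use of multiple embeddings in Figure 3.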

Theorems & Definitions (17)

  • Example 1
  • Example 2
  • Example 3
  • Example 4
  • Lemma 1: penaloza2017towards
  • Definition 1: $\mathcal{SEL}$ normal form
  • Proposition 1
  • Definition 2
  • Definition 3: Geometric $\mathcal{SEL}$ Interpretation
  • Lemma 2
  • ...and 7 more