Feature Propagation on Knowledge Graphs using Cellular Sheaves

John Cobb; Thomas Gebhart

TL;DR

The paper proposes a theoretically grounded approach for inductive knowledge graph reasoning by modeling KG embeddings as global sections of a cellular sheaf and propagating known embeddings to unseen nodes via a sheaf Laplacian-driven diffusion. It derives a closed-form harmonic extension for the optimal interior embeddings and provides a convergent Euler-based iterative scheme with explicit rate guarantees, enabling scalable inference without retraining. Empirically, the method yields competitive or superior performance on semi-inductive logical query reasoning and inductive KG completion across large benchmarks, sometimes matching or exceeding specialized inductive models. The approach offers a simple, interpretable, and robust baseline for extending transductive KG embeddings to inductive settings, with strong theoretical guarantees and practical scalability.

Abstract

Many inference tasks on knowledge graphs, including relation prediction, operate on knowledge graph embeddings -- vector representations of the vertices (entities) and edges (relations) that preserve task-relevant structure encoded within the underlying combinatorial object. Such knowledge graph embeddings can be modeled as an approximate global section of a cellular sheaf, an algebraic structure over the graph. Using the diffusion dynamics encoded by the corresponding sheaf Laplacian, we optimally propagate known embeddings of a subgraph to inductively represent new entities introduced into the knowledge graph at inference time. We implement this algorithm via an efficient iterative scheme and show that on a number of large-scale knowledge graph embedding benchmarks, our method is competitive with -- and in some scenarios outperforms -- more complex models derived explicitly for inductive knowledge graph reasoning tasks.
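The propagation step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a purely linear sheaf (restriction maps only, no relation translation vectors), dense matrices, and a step size of one over the largest eigenvalue of the interior Laplacian block. The function names `sheaf_laplacian`, `harmonic_extension`, and `euler_diffusion` are hypothetical and chosen only for this example.

```python
import numpy as np

def sheaf_laplacian(num_nodes, dim, triples, maps):
    """Assemble L = delta^T delta for a linear sheaf with dim-dimensional stalks.

    triples: list of (head, relation, tail) integer ids.
    maps: dict relation -> (R_head, R_tail), each a (dim x dim) array.
    (Assumption: all stalks share one dimension; the paper allows more generality.)
    """
    delta = np.zeros((len(triples) * dim, num_nodes * dim))
    for e, (h, r, t) in enumerate(triples):
        R_h, R_t = maps[r]
        rows = slice(e * dim, (e + 1) * dim)
        delta[rows, h * dim:(h + 1) * dim] -= R_h  # head side of the coboundary
        delta[rows, t * dim:(t + 1) * dim] += R_t  # tail side of the coboundary
    return delta.T @ delta

def _blocks(nodes, dim):
    # Flat coordinate indices belonging to the given node ids.
    return np.concatenate([np.arange(n * dim, (n + 1) * dim) for n in nodes])

def harmonic_extension(L, boundary, interior, x_boundary, dim):
    """Closed form: solve L_UU x_U = -L_UB x_B for the interior embeddings."""
    B, U = _blocks(boundary, dim), _blocks(interior, dim)
    L_UU, L_UB = L[np.ix_(U, U)], L[np.ix_(U, B)]
    x_U = np.linalg.solve(L_UU, -L_UB @ x_boundary.reshape(-1))
    return x_U.reshape(len(interior), dim)

def euler_diffusion(L, boundary, interior, x_boundary, dim, steps=200):
    """Iterative form: explicit Euler steps on the interior, boundary held fixed."""
    B, U = _blocks(boundary, dim), _blocks(interior, dim)
    L_UU, L_UB = L[np.ix_(U, U)], L[np.ix_(U, B)]
    step = 1.0 / np.linalg.eigvalsh(L_UU).max()  # assumed step size, below 2 / lambda_max
    rhs = L_UB @ x_boundary.reshape(-1)
    x_U = np.zeros(len(U))  # new entities start at zero
    for _ in range(steps):
        x_U = x_U - step * (L_UU @ x_U + rhs)
    return x_U.reshape(len(interior), dim)

# Toy graph 0 -> 1 -> 2 with one relation and identity restriction maps.
dim = 2
triples = [(0, 0, 1), (1, 0, 2)]
maps = {0: (np.eye(dim), np.eye(dim))}
L = sheaf_laplacian(3, dim, triples, maps)

boundary, interior = [0, 2], [1]            # entity 1 is the new, unseen entity
x_boundary = np.array([[0.0, 0.0], [2.0, 4.0]])

closed = harmonic_extension(L, boundary, interior, x_boundary, dim)
iterated = euler_diffusion(L, boundary, interior, x_boundary, dim)
```

With identity maps the sheaf Laplacian reduces to the ordinary graph Laplacian, so the new entity lands at the average of its two known neighbors, and the iterative scheme agrees with the direct solve. On large graphs the iterative form is the practical choice, since it needs only sparse matrix-vector products rather than a linear solve.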

Paper Structure (14 sections, 3 theorems, 32 equations, 7 figures, 4 tables)

Introduction
Background and Notation
Knowledge Graphs and Their Embeddings
Cellular Sheaves
Harmonic Extension
Classical Harmonic Extension
Harmonic Extension on Knowledge Graphs
Iterative Scheme and Training Rate
Experiments
Semi-Inductive Logical Query Reasoning
Inductive Knowledge Graph Completion
Additional Experimental Details
Choosing Diffusion Iterations
Hits@k

Key Result

Theorem 3.1

Let $G = (\mathcal{E},\mathcal{R},\mathcal{T})$ be a knowledge graph with relation parameters ${\mathbf{R}}_{rv},{\mathbf{R}}_{rw},{\mathbf{r}}_r$ and entity embeddings ${\mathbf{x}}_B$ fixed on a boundary set $B\subset\mathcal{E}$. Assume every interior vertex in $U = \mathcal{E}\setminus B$ has a [...] In particular, ${\mathbf{x}}_U^*$ minimizes $E({\mathbf{x}},G)$ and the scoring function $f^E$ over [...]

Figures (7)

Figure 1: Overview of method. A: An example knowledge graph. B: The representations of entities and relations are learned using a knowledge graph embedding method. C: A knowledge graph with new entities (colored red) introduced among existing entities (colored blue). D: Representations for new entities are inferred by harmonic extension.
Figure 2: Examples of conjunctive logical query structures considered in this paper. Unknown entities are gray, source entities are colored blue, and target entities are colored red. Evaluating $1p$ queries corresponds to traditional knowledge graph completion.
Figure 3: Model performance across different logical query structures with respect to the ratio of inductive to transductive entities. TransE, RotatE, TransR, and SE models are trained transductively on the $1p$ query completion task and then extended to infer representations for new entities. Performance of the Edge-type Heuristic and NodePiece models is taken from galkin2022inductive.
Figure 4: Knowledge graph completion performance in the inductive setting across version splits of each dataset. SE, TransE, RotatE, and TransR models are trained transductively on the knowledge graph completion task then extended to infer entity representations for $\mathcal{E}_{\mathrm{test}}$.
Figure 5: Fully-inductive knowledge graph completion performance for SE, TransE, RotatE, and TransR, with the number of diffusion iterations chosen so that (left) performance is maximized on the test graph and (right) performance is best on the validation graphs. Error bars indicate standard error across dataset version splits.
...and 2 more figures

Theorems & Definitions (8)

Definition 1
Definition 2
Theorem 3.1
proof
Remark 1
Theorem 3.2
proof
Corollary 3.3
