SnapE -- Training Snapshot Ensembles of Link Prediction Models

Ali Shaban; Heiko Paulheim

TL;DR

The paper tackles the challenge of robust link prediction in sparse knowledge graphs by employing snapshot ensembles (SnapE) within the same training budget as a single model. It adapts cyclic learning-rate schedules to store multiple diverse snapshots and introduces an extended negative sampling strategy that uses prior snapshots to generate hard negatives, while combining predictions via normalization and weighting. Across four datasets and four base KGEs, SnapE delivers consistent improvements over single-model baselines and offers favorable trade-offs compared to low-dimensional ensemble approaches, without increasing training cost. The work also analyzes runtime, diversity of models, and ablations, outlining directions for broader applicability to other graph methods and downstream tasks.
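To make the cyclic schedule concrete, here is a minimal sketch of cyclic cosine annealing in the style of the original snapshot ensemble work. It illustrates the general idea only and is not the authors' implementation: the function names, the 0-based epoch convention, and the example values (100 epochs, 5 cycles, peak rate 0.1) are assumptions made for illustration.

```python
import math


def cyclic_cosine_lr(epoch, total_epochs, n_cycles, lr_max):
    """Learning rate for a 0-based epoch under cyclic cosine annealing.

    Assumption: every cycle restarts at lr_max and decays towards zero.
    """
    cycle_len = math.ceil(total_epochs / n_cycles)
    pos = epoch % cycle_len  # position inside the current cycle
    return 0.5 * lr_max * (math.cos(math.pi * pos / cycle_len) + 1.0)


def snapshot_epochs(total_epochs, n_cycles):
    """Last epoch of each cycle, i.e. where the learning rate is lowest."""
    cycle_len = math.ceil(total_epochs / n_cycles)
    return [min((i + 1) * cycle_len, total_epochs) - 1 for i in range(n_cycles)]


if __name__ == "__main__":
    # Hypothetical budget: 100 epochs split into 5 cycles.
    for e in snapshot_epochs(100, 5):
        print(e, cyclic_cosine_lr(e, 100, 5, 0.1))
```

Storing a model copy at each epoch returned by `snapshot_epochs` yields one ensemble member per cycle while the total number of training epochs stays the same as for a single model.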

Abstract

Snapshot ensembles have been widely used in various fields of prediction. They allow for training an ensemble of prediction models at the cost of training a single one. They are known to yield more robust predictions by creating a set of diverse base models. In this paper, we introduce an approach to transfer the idea of snapshot ensembles to link prediction models in knowledge graphs. Moreover, since link prediction in knowledge graphs is a setup without explicit negative examples, we propose a novel training loop that iteratively creates negative examples using previous snapshot models. An evaluation with four base models across four datasets shows that this approach consistently outperforms the single model approach, while keeping the training time constant.
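The idea of creating negative examples from a previous snapshot can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's training loop: `sample_hard_negatives` and `prev_score` are hypothetical names, only tails are corrupted, a higher score is assumed to mean a more plausible triple, and uniform random sampling is assumed as the fallback before any snapshot exists.

```python
import random


def sample_hard_negatives(triple, entities, known_triples, prev_score, k, rng=None):
    """Corrupt the tail of a positive triple to obtain k negatives.

    prev_score: hypothetical callable (h, r, t) -> float from the previous
    snapshot, or None if no snapshot has been stored yet.
    """
    h, r, t = triple
    # Never return a triple that is known to be true.
    candidates = [e for e in entities if e != t and (h, r, e) not in known_triples]
    if prev_score is None:
        # No snapshot yet: fall back to uniform random corruption.
        rng = rng or random.Random(0)
        chosen = rng.sample(candidates, min(k, len(candidates)))
    else:
        # Hard negatives: false triples the previous snapshot rates as most plausible.
        chosen = sorted(candidates, key=lambda e: prev_score(h, r, e), reverse=True)[:k]
    return [(h, r, e) for e in chosen]


if __name__ == "__main__":
    entities = ["a", "b", "c", "d"]
    known = {("a", "r", "b")}
    toy_scores = {"a": 0.1, "c": 0.9, "d": 0.5}  # made-up scores for illustration
    print(sample_hard_negatives(("a", "r", "b"), entities, known,
                                lambda h, r, e: toy_scores[e], k=2))
```

Negatives chosen this way are the ones the previous snapshot confuses with true triples, so the next training cycle is pushed to correct exactly those mistakes.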

Paper Structure (15 sections, 2 equations, 9 figures, 5 tables)

Introduction
Related Work
Approach
Learning Rate Schedulers
Combining Predictions
Negative Samplers
Evaluation
Datasets
Base Models
Parameters
Results
Runtime Behavior
Model Variety
Ablation Study
Conclusion and Outlook

Figures (9)

Figure 1: Non-cyclic vs. cyclic learning rate schedules
Figure 2: Learning rate schedules used in this paper. Snapshots are always stored at the minima of the learning rate schedules.
Figure 3: Usage of the extended negative sampler in the cosine annealing (left) and deferred cosine annealing (right) setting.
Figure 4: Illustration of the overall approach
Figure 5: Results of SnapE compared to baselines and to the Mbase ensemble approach. Ensemble approaches outperforming their representative baselines are marked in bold. The best overall results are underlined. The last column shows the average change in HITS@10 over the respective base model across all four datasets. Moreover, we show the average change of each metric for each dataset across all models.
...and 4 more figures
