T-REGS: Minimum Spanning Tree Regularization for Self-Supervised Learning
Julie Mordacq, David Loiseaux, Vicky Kalogeiton, Steve Oudot
TL;DR
This paper tackles dimensional collapse and non-uniformity in self-supervised learning by introducing T-REGS, a simple, GPU-friendly regularization that maximizes the length of the minimum spanning tree (MST) of embeddings while constraining them to a compact manifold (a unit sphere). Theoretical results connect MST length to entropy on Riemannian manifolds, showing that MST-based optimization promotes uniformity and prevents collapse, with distinct analyses for small and large sample regimes. T-REGS extends this idea to SSL by applying MST-based regularization to each SSL branch, either standalone or as an auxiliary term to existing objectives, and demonstrates competitive performance on standard JE-SSL benchmarks and improved cross-modal retrieval in CLIP-style settings. The findings indicate that MST regularization offers a principled, scalable route to richer, more uniformly distributed representations, with practical benefits for both unimodal and multimodal learning tasks.
Abstract
Self-supervised learning (SSL) has emerged as a powerful paradigm for learning representations without labeled data, often by enforcing invariance to input transformations such as rotations or blurring. Recent studies have highlighted two pivotal properties for effective representations: (i) avoiding dimensional collapse, in which the learned features occupy only a low-dimensional subspace, and (ii) enhancing uniformity of the induced distribution. In this work, we introduce T-REGS, a simple regularization framework for SSL based on the length of the Minimum Spanning Tree (MST) over the learned representation. We provide theoretical analysis demonstrating that T-REGS simultaneously mitigates dimensional collapse and promotes distribution uniformity on arbitrary compact Riemannian manifolds. Several experiments on synthetic data and on classical SSL benchmarks validate the effectiveness of our approach at enhancing representation quality.
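To make the central quantity concrete, the following minimal sketch computes the Euclidean MST length of a small set of embeddings projected onto the unit sphere. It is an illustration only, not the authors' implementation: the function names `normalize` and `mst_length` are hypothetical, the O(n^2) Prim's algorithm in pure Python stands in for whatever GPU-friendly procedure the paper uses, and the exact form of the T-REGS loss is not given in this summary.

```python
import math


def normalize(v):
    """Project a non-zero vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mst_length(points):
    """Total Euclidean edge length of the MST, using Prim's algorithm in O(n^2)."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the current tree to each point
    best[0] = 0.0          # start the tree at point 0 (contributes no edge)
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree with the cheapest connecting edge.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        # Update the cheapest connecting edge for the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


# Illustrative data (assumed, not from the paper): four well-spread points
# on the unit circle versus four nearly identical (collapsed) points.
spread = [normalize(p) for p in [[1, 0], [0, 1], [-1, 0], [0, -1]]]
collapsed = [normalize([1, 0.01 * k]) for k in range(4)]

print("spread MST length:   ", mst_length(spread))
print("collapsed MST length:", mst_length(collapsed))
```

In this toy comparison the well-spread points yield a much longer MST than the collapsed ones, which is the intuition behind using MST length as a signal to maximize. A regularizer in this spirit would add the negative MST length to a training objective while keeping embeddings on the sphere; how T-REGS weights and enforces these terms is not specified here.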
