On the Universality of Self-Supervised Learning
Wenwen Qiang, Jingyao Wang, Changwen Zheng, Hui Xiong, Gang Hua
TL;DR
The paper asks what makes a self-supervised representation good. It defines SSL universality as the combination of discriminability, generalizability, and transferability, and models these properties explicitly with General SSL (GeSSL). GeSSL uses a bi-level optimization framework: an inner loop learns a proxy model $f'$ on a support set with a discriminative loss $L_{disc}$ guided by an auxiliary network $g$, while an outer loop updates the base model $f$ and the auxiliary network $g$ on a query set to promote cross-task generalization. The authors derive a generalization bound on the risk for unseen tasks and report empirical gains on unsupervised, semi-supervised, transfer, and few-shot benchmarks, which supports explicit modeling of universality as a principled route to universal SSL representations.
Abstract
In this paper, we investigate what constitutes a good representation or model in self-supervised learning (SSL). We argue that a good representation should exhibit universality, characterized by three essential properties: discriminability, generalizability, and transferability. While these capabilities are implicitly desired in most SSL frameworks, existing methods lack an explicit modeling of universality, and its theoretical foundations remain underexplored. To address these gaps, we propose General SSL (GeSSL), a novel framework that explicitly models universality from three complementary dimensions: the optimization objective, the parameter update mechanism, and the learning paradigm. GeSSL integrates a bi-level optimization structure that jointly models task-specific adaptation and cross-task consistency, thereby capturing all three aspects of universality within a unified SSL objective. Furthermore, we derive a theoretical generalization bound, ensuring that the optimization process of GeSSL consistently leads to representations that generalize well to unseen tasks. Empirical results on multiple benchmark datasets demonstrate that GeSSL consistently achieves superior performance across diverse downstream tasks, validating its effectiveness in modeling universal representations.
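The bi-level structure described above (task-specific adaptation on a support set, then a cross-task update on a query set) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a linear model with a squared-error loss as a stand-in for $L_{disc}$, a single inner gradient step, and a first-order approximation of the outer gradient, and it omits the auxiliary network $g$ entirely. All names (`loss_and_grad`, `adapt`, `bilevel_step`, `mean_query_loss`, `make_task`) and the toy tasks are hypothetical.

```python
import numpy as np


def loss_and_grad(w, x, y):
    """Mean squared error of a linear model x @ w and its gradient in w.

    Assumption: a placeholder for the discriminative loss, not the paper's.
    """
    residual = x @ w - y
    return float(np.mean(residual ** 2)), 2.0 * x.T @ residual / len(y)


def adapt(w, support, inner_lr):
    """Inner loop (a single step): build a proxy model from the support set."""
    _, grad = loss_and_grad(w, *support)
    return w - inner_lr * grad


def bilevel_step(w, tasks, inner_lr=0.05, outer_lr=0.05):
    """One outer update of the base parameters over (support, query) tasks."""
    outer_grad = np.zeros_like(w)
    for support, query in tasks:
        w_proxy = adapt(w, support, inner_lr)
        # First-order approximation (an assumption of this sketch): the query
        # gradient is taken at the proxy and applied to w directly, ignoring
        # the dependence of w_proxy on w.
        _, grad = loss_and_grad(w_proxy, *query)
        outer_grad += grad
    return w - outer_lr * outer_grad / len(tasks)


def mean_query_loss(w, tasks, inner_lr=0.05):
    """Average query loss after adapting to each task's support set."""
    losses = [loss_and_grad(adapt(w, s, inner_lr), *q)[0] for s, q in tasks]
    return float(np.mean(losses))


def make_task(rng, w_shared, n=20):
    """Toy task: a linear target close to a shared weight vector."""
    w_task = w_shared + 0.1 * rng.normal(size=w_shared.shape)
    x = rng.normal(size=(2 * n, w_shared.size))
    y = x @ w_task
    return (x[:n], y[:n]), (x[n:], y[n:])  # (support, query)


rng = np.random.default_rng(0)
w_shared = rng.normal(size=5)
tasks = [make_task(rng, w_shared) for _ in range(8)]

w = np.zeros(5)
before = mean_query_loss(w, tasks)
for _ in range(200):
    w = bilevel_step(w, tasks)
after = mean_query_loss(w, tasks)
print(f"query loss before: {before:.4f}, after: {after:.4f}")
```

The split into `adapt` and `bilevel_step` mirrors the separation between the proxy model $f'$ and the base model $f$: only the base parameters persist across outer steps, while each proxy is discarded once its query gradient has been computed.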
