GenURL: A General Framework for Unsupervised Representation Learning
Siyuan Li, Zicheng Liu, Zelin Zang, Di Wu, Zhiyuan Chen, Stan Z. Li
TL;DR
GenURL presents a unified framework for unsupervised representation learning (URL) that combines data structural modeling (DSM), which describes global data structure, with low-dimensional transformation (LDT), which learns compact embeddings, under a generalized similarity objective. It introduces static and dynamic input similarities and uses a General Kullback-Leibler divergence (GKL) to connect global structures with local transformations, enabling adaptation to dimension reduction (DR), graph embedding (GE), self-supervised learning (SSL), and knowledge distillation (KD) tasks. In extensive experiments on these four URL tasks, GenURL achieves state-of-the-art results, and its analyses of hyperparameters and loss functions reveal when to emphasize global topology versus local instance discrimination. The approach offers a practical, task-agnostic route to robust representations, with insights into the relationships between DR, GE, SSL, and KD and guidance for future extensions.
Abstract
Unsupervised representation learning (URL), which learns compact embeddings of high-dimensional data without supervision, has made remarkable progress recently. However, URL methods for different requirements have been developed independently, which limits the generalization of the algorithms and becomes especially prohibitive as the number of tasks grows. For example, dimension reduction methods such as t-SNE and UMAP optimize pair-wise data relationships by preserving the global geometric structure, while self-supervised learning methods such as SimCLR and BYOL focus on mining the local statistics of instances under specific augmentations. To address this dilemma, we summarize and propose a unified similarity-based URL framework, GenURL, which can smoothly adapt to various URL tasks. In this paper, we regard URL tasks as different implicit constraints on the data geometric structure that help to seek optimal low-dimensional representations, and these constraints boil down to data structural modeling (DSM) and low-dimensional transformation (LDT). Specifically, DSM provides a structure-based submodule to describe the global structures, and LDT learns compact low-dimensional embeddings with given pretext tasks. Moreover, an objective function, General Kullback-Leibler divergence (GKL), is proposed to connect DSM and LDT naturally. Comprehensive experiments demonstrate that GenURL achieves consistent state-of-the-art performance in self-supervised visual learning, unsupervised knowledge distillation (KD), graph embeddings (GE), and dimension reduction.
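To make the similarity-based idea concrete, the following is a minimal sketch of matching input-space similarities to embedding-space similarities with a divergence. It is not the GKL objective defined in the paper: the Gaussian input kernel, the Student-t latent kernel (borrowed from t-SNE), the textbook generalized KL (I-divergence) for unnormalized non-negative values, and all function names and the `sigma` parameter are assumptions made purely for illustration.

```python
import numpy as np

def pairwise_sq_dists(x):
    # Squared Euclidean distances between all rows of x.
    sq = np.sum(x * x, axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.maximum(d, 0.0)  # clamp small negatives from round-off

def input_similarity(x, sigma=1.0):
    # Assumed choice: Gaussian kernel on input-space distances.
    return np.exp(-pairwise_sq_dists(x) / (2.0 * sigma ** 2))

def latent_similarity(z):
    # Assumed choice: Student-t kernel with one degree of freedom (as in t-SNE).
    return 1.0 / (1.0 + pairwise_sq_dists(z))

def generalized_kl(p, q, eps=1e-12):
    # Textbook generalized KL (I-divergence) for unnormalized non-negative
    # arrays: sum(p * log(p / q) - p + q). Stand-in for the paper's GKL.
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q) - p + q))

def similarity_loss(x, z, sigma=1.0):
    # Compare similarities over all distinct pairs (off-diagonal entries).
    mask = ~np.eye(len(x), dtype=bool)
    p = input_similarity(x, sigma)[mask]
    q = latent_similarity(z)[mask]
    return generalized_kl(p, q)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))  # toy high-dimensional inputs
z = rng.normal(size=(8, 2))   # toy low-dimensional embeddings
loss = similarity_loss(x, z)
```

In a real setting, `z` would be produced by a trainable encoder and the loss minimized with an automatic-differentiation library; here the embeddings are random so that the sketch stays self-contained.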
