Cluster Specific Representation Learning
Mahalakshmi Sabanayagam, Omar Al-Dabooni, Pascal Esser
TL;DR
This work addresses the intrinsic structure of data by proposing a downstream-agnostic, cluster-specific representation learning framework. It formalizes a tensorized objective that jointly learns cluster assignments and cluster-specific encoders/decoders, and introduces a scalable partial tensorization approach that keeps the parameter count manageable by sharing a base encoder before cluster-specific heads. The framework is instantiated across Autoencoders, Variational Autoencoders, Contrastive Losses, and Restricted Boltzmann Machines (TAE/PTAE, TVAE, TCL, TRBM), with extensive numerical evidence showing improvements in clustering accuracy and denoising when data exhibit inherent clusters. While incurring some runtime and parameter overhead, the approach uncovers intrinsic cluster structure and improves performance on relevant tasks, offering a general, integrable paradigm for cluster-aware representation learning.
Abstract
Representation learning aims to extract meaningful lower-dimensional embeddings, known as representations, from data. Despite its widespread application, there is no established definition of a "good" representation. Typically, representation quality is evaluated by performance on downstream tasks such as clustering and denoising. This task-specific approach has a limitation: a representation that performs well for one task may not be effective for another. This highlights the need for a more agnostic formulation, which is the focus of our work. We propose a downstream-agnostic formulation: when inherent clusters exist in the data, the representations should be specific to each cluster. Building on this idea, we develop a meta-algorithm that jointly learns cluster-specific representations and cluster assignments. As our approach is easy to integrate with any representation learning framework, we demonstrate its effectiveness in various setups, including Autoencoders, Variational Autoencoders, Contrastive learning models, and Restricted Boltzmann Machines. We qualitatively compare our cluster-specific embeddings to standard embeddings and evaluate them on downstream tasks such as denoising and clustering. While our method slightly increases runtime and parameters compared to the standard model, the experiments show that it extracts the inherent cluster structures in the data, resulting in improved performance in relevant applications.
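To make the idea of jointly learning cluster assignments and cluster-specific representations concrete, the following is a minimal sketch and not the method proposed in this work. It assumes linear encoders and decoders fitted in closed form by a per-cluster SVD, hard assignments chosen by lowest reconstruction error, and a simple alternating scheme. The function names, the re-seeding of near-empty clusters, and the stopping rule are all illustrative assumptions.

```python
import numpy as np


def reconstruction_errors(X, centers, bases):
    """Squared reconstruction error of every point under every cluster's linear autoencoder."""
    errs = []
    for c, B in zip(centers, bases):
        Z = (X - c) @ B.T      # encode: project centered data onto the cluster basis
        X_hat = Z @ B + c      # decode: map the embedding back to input space
        errs.append(((X - X_hat) ** 2).sum(axis=1))
    return np.stack(errs, axis=1)  # shape (n_samples, k)


def fit_cluster_specific_linear_ae(X, k, latent_dim, n_iters=20, seed=0):
    """Alternate between refitting one linear autoencoder per cluster and reassigning points.

    Illustrative simplification: hard assignments and closed-form (SVD) fits,
    not the tensorized objective described in the text.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    assign = rng.integers(0, k, size=n)          # random initial assignment (assumption)
    centers = np.zeros((k, d))
    bases = np.zeros((k, latent_dim, d))
    for _ in range(n_iters):
        for j in range(k):
            pts = X[assign == j]
            if len(pts) <= latent_dim:
                # Too few points to fit a basis: re-seed from random samples (assumption).
                pts = X[rng.choice(n, size=latent_dim + 1, replace=False)]
            centers[j] = pts.mean(axis=0)
            _, _, vt = np.linalg.svd(pts - centers[j], full_matrices=False)
            bases[j] = vt[:latent_dim]           # top principal directions of this cluster
        new_assign = reconstruction_errors(X, centers, bases).argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break                                # assignments stopped changing
        assign = new_assign
    return assign, centers, bases


if __name__ == "__main__":
    # Synthetic data, for illustration only.
    X = np.random.default_rng(1).normal(size=(60, 5))
    assign, centers, bases = fit_cluster_specific_linear_ae(X, k=3, latent_dim=2)
    print(assign.shape, centers.shape, bases.shape)
```

In this sketch each cluster keeps its own center and basis, so the embedding of a point depends on the cluster it is assigned to. A neural instantiation would replace the SVD step with gradient updates of cluster-specific encoder and decoder networks.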
