Cluster Specific Representation Learning

Mahalakshmi Sabanayagam, Omar Al-Dabooni, Pascal Esser

TL;DR

This work proposes a downstream-agnostic, cluster-specific representation learning framework that exploits the intrinsic cluster structure of data. It formalizes a tensorized objective that jointly learns cluster assignments and cluster-specific encoders/decoders, and introduces a scalable partial tensorization approach that keeps parameters manageable by sharing a base encoder before cluster-specific heads. The framework is instantiated across Autoencoders, Variational Autoencoders, Contrastive Losses, and Restricted Boltzmann Machines (TAE/PTAE, TVAE, TCL, TRBM), with numerical evidence showing improvements in clustering accuracy and denoising when data exhibit inherent clusters. While incurring some runtime and parameter overhead, the approach uncovers intrinsic cluster structure and improves performance on relevant tasks, offering a general paradigm for cluster-aware representation learning that integrates with existing models.

Abstract

Representation learning aims to extract meaningful lower-dimensional embeddings from data, known as representations. Despite its widespread application, there is no established definition of a "good" representation. Typically, the representation quality is evaluated based on its performance in downstream tasks such as clustering, de-noising, etc. However, this task-specific approach has a limitation: a representation that performs well for one task may not necessarily be effective for another. This highlights the need for a more agnostic formulation, which is the focus of our work. We propose a downstream-agnostic formulation: when inherent clusters exist in the data, the representations should be specific to each cluster. Under this idea, we develop a meta-algorithm that jointly learns cluster-specific representations and cluster assignments. As our approach is easy to integrate with any representation learning framework, we demonstrate its effectiveness in various setups, including Autoencoders, Variational Autoencoders, Contrastive learning models, and Restricted Boltzmann Machines. We qualitatively compare our cluster-specific embeddings with standard embeddings and evaluate them on downstream tasks such as de-noising and clustering. While our method slightly increases runtime and parameters compared to the standard model, the experiments clearly show that it extracts the inherent cluster structures in the data, resulting in improved performance in relevant applications.
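
The idea of jointly learning cluster assignments and cluster-specific representations can be illustrated with a small sketch. This is not the paper's tensorized objective: it assumes hard assignments, uses a PCA-based linear autoencoder per cluster, and alternates between fitting and reassigning, in the spirit of a k-subspaces procedure. The function names (`fit_linear_ae`, `reconstruction_error`, `cluster_specific_linear_ae`) and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_linear_ae(X, dim):
    """Fit a linear autoencoder to X (rows are samples) via PCA.

    Returns the mean and an orthonormal basis W of shape (d, dim);
    encoding is (x - mu) @ W and decoding is z @ W.T + mu.
    """
    mu = X.mean(axis=0)
    # Right singular vectors of the centered data span the best rank-`dim` subspace.
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:dim].T


def reconstruction_error(X, mu, W):
    """Squared reconstruction error of every sample under one linear autoencoder."""
    Z = (X - mu) @ W          # encode
    X_hat = Z @ W.T + mu      # decode
    return ((X - X_hat) ** 2).sum(axis=1)


def cluster_specific_linear_ae(X, k, dim, n_iter=20, seed=0):
    """Alternate between fitting one autoencoder per cluster and reassigning points.

    Assumption (not from the source): hard assignments, each point goes to the
    cluster whose autoencoder reconstructs it best.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labels = rng.integers(k, size=n)  # random initial assignment
    models = []
    for _ in range(n_iter):
        models = []
        for j in range(k):
            members = X[labels == j]
            if len(members) <= dim:
                # Cluster too small to fit: re-seed it with random samples.
                members = X[rng.choice(n, size=dim + 1, replace=False)]
            models.append(fit_linear_ae(members, dim))
        errors = np.stack(
            [reconstruction_error(X, mu, W) for mu, W in models], axis=1
        )
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels, models
```

Calling `cluster_specific_linear_ae(X, k=3, dim=1)` on an `(n, d)` array returns one label per sample and a list of `k` `(mean, basis)` pairs, one cluster-specific encoder/decoder each. Like other alternating schemes, this sketch can converge to a local optimum that depends on the random initialization.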

Paper Structure

This paper contains 21 sections, 10 equations, 13 figures.

Figures (13)

  • Figure 1: Representations obtained by a linear AE. Left: full dataset and representation of the full dataset (illustrated by the arrow). Right: cluster specific representations.
  • Figure 2: Performance of AE, TAE and PTAE. Left: clustering obtained by different models with k-means as benchmark. Plotted is ARI (higher is better). Center: de-noising of different models. MSE in log-scale (lower is better). Right: number of parameters in each considered model.
  • Figure 3: Top row: samples from the latent space. The leftmost plot shows samples from the latent space of the standard VAE. The following three plots show samples from the cluster-specific latent spaces of the TVAE. Middle row: embedding of training samples. Embedding of training data-points using the standard VAE and TVAE. Plotted are the obtained means as well as the contour lines of the Gaussian model. Bottom row: embedding of test samples. Same setting as the middle row but for unseen data-points.
  • Figure 4: Illustration of the embedding obtained from the (tensorized) contrastive loss. Columns: left shows the standard CL setting and right shows the TCL setup. Rows: top row shows the embedding obtained without any class information and bottom row shows the embedding obtained under class information.
  • Figure 5: Reconstruction through TRBM. Left: true samples. Middle/Right: reconstruction for class one and two.
  • ...and 8 more figures