DRESS: Disentangled Representation-based Self-Supervised Meta-Learning for Diverse Tasks

Wei Cui, Tongzi Wu, Jesse C. Cresswell, Yi Sui, Keyvan Golestan

TL;DR

The paper investigates why meta-learning often underperforms pre-training in few-shot scenarios, attributing it to insufficient task diversity. It introduces DRESS, a framework that uses disentangled representations to build diverse self-supervised meta-training tasks and a class-partition-based diversity metric to quantify task diversity. DRESS can pair with optimization-based meta-learners like MAML and leverages encoders such as FDAE or LSD to create per-dimension pseudo-classes for many tasks. Empirical results on SmallNORB, Shapes3D, Causal3D, MPI3D, and CelebA show DRESS often surpasses unsupervised baselines and approaches supervised baselines, highlighting the importance of task diversity and disentangled representations for robust fast adaptation.
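
The idea of per-dimension pseudo-classes can be illustrated with a minimal sketch. It assumes the latent codes have already been produced by a disentangled encoder (random values stand in for them here), and it uses simple quantile binning as a stand-in for the clustering step; the function names `pseudo_labels_per_dimension` and `sample_task` are hypothetical and not taken from the paper.

```python
import numpy as np


def pseudo_labels_per_dimension(latents, num_classes):
    """Assign each sample a pseudo-class on every latent dimension.

    latents: (num_samples, num_dims) array of (assumed disentangled) codes.
    Returns an integer array of the same shape with values in [0, num_classes).
    Quantile binning is an assumption, used in place of a clustering method.
    """
    latents = np.asarray(latents, dtype=float)
    labels = np.empty(latents.shape, dtype=int)
    # Interior quantile levels, e.g. [1/3, 2/3] for three classes.
    levels = np.linspace(0.0, 1.0, num_classes + 1)[1:-1]
    for d in range(latents.shape[1]):
        edges = np.quantile(latents[:, d], levels)
        # searchsorted maps each value to the index of the bin it falls into.
        labels[:, d] = np.searchsorted(edges, latents[:, d], side="right")
    return labels


def sample_task(labels, dim, shots, rng):
    """Draw a few-shot support set from the pseudo-classes of one dimension."""
    support = []
    for c in np.unique(labels[:, dim]):
        members = np.flatnonzero(labels[:, dim] == c)
        support.append(rng.choice(members, size=shots, replace=False))
    idx = np.concatenate(support)
    return idx, labels[idx, dim]


rng = np.random.default_rng(0)
latents = rng.normal(size=(200, 4))  # placeholder for encoder outputs
labels = pseudo_labels_per_dimension(latents, num_classes=2)
support_idx, support_y = sample_task(labels, dim=1, shots=5, rng=rng)
```

Each latent dimension yields its own labeling of the same images, so sampling `dim` at random gives tasks of different natures from a single unlabeled dataset.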

Abstract

Meta-learning represents a strong class of approaches for solving few-shot learning tasks. Nonetheless, recent research suggests that simply pre-training a generic encoder can potentially surpass meta-learning algorithms. In this paper, we first discuss the reasons why meta-learning fails to stand out in these few-shot learning experiments, and hypothesize that it is due to the few-shot learning tasks lacking diversity. We propose DRESS, a task-agnostic Disentangled REpresentation-based Self-Supervised meta-learning approach that enables fast model adaptation on highly diversified few-shot learning tasks. Specifically, DRESS utilizes disentangled representation learning to create self-supervised tasks that can fuel the meta-training process. Furthermore, we propose a class-partition-based metric for quantifying the task diversity directly on the input space. We validate the effectiveness of DRESS through experiments on datasets with multiple factors of variation and varying complexity. The results suggest that DRESS is able to outperform competing methods on the majority of the datasets and task setups. Through this paper, we advocate for a re-examination of proper setups for task adaptation studies, and aim to reignite interest in the potential of meta-learning for solving few-shot learning tasks via disentangled representations.

Paper Structure

This paper contains 33 sections, 3 equations, 7 figures, 12 tables, 1 algorithm.

Figures (7)

  • Figure 1: DRESS creates diversified self-supervised meta-training tasks through disentanglement learning. Images are first encoded into disentangled latent representations. The latent representations are then semantically aligned across the dataset so that sets of clusters can be formed on each latent dimension individually. Each set of clusters acts as pseudo-classes for a distinct self-supervised classification task. Hence, each disentangled latent dimension creates a meta-learning task with its own unique nature.
  • Figure 2: Illustration of class-partition based task diversity. A binary classification task is defined by two ellipses of the same color on an input space. Left: Two similar tasks where classes have high overlap in terms of data points. Right: Two dissimilar tasks, with less overlap between the class partitions.
  • Figure 3: Two self-supervised tasks constructed by DRESS on MPI3D. Left: The task focuses on classifying the object color. Right: The task focuses on identifying the robot arm angle.
  • Figure 4: Two self-supervised tasks constructed by DRESS on CelebA. Left: The task focuses on identifying if the person wears eyeglasses or not. Right: The task focuses on identifying if the person has hair bangs (hair curtain covering the forehead) or not.
  • Figure 5: During the meta-training stage, the model adapts to batches of sampled tasks, and its performance after adaptation serves as the objective for optimizing the meta-parameters. After meta-training, the model can be quickly adapted to meta-testing tasks to perform few-shot inference.
  • ...and 2 more figures
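
The class-partition view of task diversity described for Figure 2 can be sketched as follows. This is an illustrative overlap score, not the paper's exact metric: it assumes every task labels the same set of data points, that all tasks have the same small number of classes, and that overlap is measured by intersection-over-union under the best matching of classes. The names `partition_overlap` and `task_diversity` are hypothetical.

```python
from itertools import permutations

import numpy as np


def partition_overlap(labels_a, labels_b):
    """Best-case mean IoU between the class partitions of two tasks.

    Both label vectors index the same data points. Class identities are
    arbitrary, so the maximum over all matchings of classes is taken.
    """
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    classes_a = np.unique(labels_a)
    classes_b = np.unique(labels_b)
    if len(classes_a) != len(classes_b):
        raise ValueError("tasks must have the same number of classes")
    # iou[i, j] = size of intersection / size of union of class i and class j.
    iou = np.zeros((len(classes_a), len(classes_b)))
    for i, ca in enumerate(classes_a):
        in_a = labels_a == ca
        for j, cb in enumerate(classes_b):
            in_b = labels_b == cb
            iou[i, j] = np.sum(in_a & in_b) / np.sum(in_a | in_b)
    rows = np.arange(len(classes_a))
    # Brute-force search over class matchings; fine for a handful of classes.
    return max(
        iou[rows, list(perm)].mean()
        for perm in permutations(range(len(classes_b)))
    )


def task_diversity(task_labels):
    """Average pairwise dissimilarity (1 - overlap) over a set of tasks."""
    n = len(task_labels)
    scores = [
        1.0 - partition_overlap(task_labels[i], task_labels[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return float(np.mean(scores))


same_split = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]   # same partition, class names swapped
other_split = [0, 1, 0, 1]  # a different way of dividing the points
similar = task_diversity([same_split, relabeled])
dissimilar = task_diversity([same_split, other_split])
```

Under this sketch, two tasks that split the data identically score zero diversity regardless of how their classes are named, while tasks whose partitions cut across each other score higher, matching the left and right panels of the figure.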