
An Efficient NAS-based Approach for Handling Imbalanced Datasets

Zhiwei Yao

TL;DR

A novel approach that enhances performance on long-tailed datasets by optimizing the backbone architecture through neural architecture search (NAS): a NAS super-network trained on a balanced source dataset is efficiently adapted to an imbalanced target dataset.

Abstract

Class imbalance is a common issue in real-world data distributions, negatively impacting the training of accurate classifiers. Traditional approaches to mitigate this problem fall into three main categories: class re-balancing, information transfer, and representation learning. This paper introduces a novel approach to enhance performance on long-tailed datasets by optimizing the backbone architecture through neural architecture search (NAS). Our research shows that an architecture's accuracy on a balanced dataset does not reliably predict its performance on imbalanced datasets. This necessitates a complete NAS run on long-tailed datasets, which can be computationally expensive. To address this computational challenge, we build on existing work, IMB-NAS, which proposes efficiently adapting a NAS super-network trained on a balanced source dataset to an imbalanced target dataset. This paper provides a detailed description of the fundamental techniques behind IMB-NAS, including NAS and architecture transfer. Among various adaptation strategies, we find that the most effective approach is to retrain the linear classification head with a reweighted loss while keeping the backbone NAS super-network, trained on the balanced source dataset, frozen. Finally, we conduct a series of experiments on imbalanced CIFAR datasets for performance evaluation. Our conclusions match those reported in the IMB-NAS paper.
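The adaptation strategy described above, retraining only a linear classification head with a reweighted loss on top of frozen backbone features, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the effective-number class weighting follows the common class-balanced loss formulation, and the feature matrix stands in for whatever the frozen super-network backbone produces.

```python
import numpy as np

def class_balanced_weights(counts, beta=0.999):
    # Effective-number reweighting: w_c is proportional to
    # (1 - beta) / (1 - beta^n_c), so rare classes get larger weights.
    w = (1.0 - beta) / (1.0 - np.power(beta, counts))
    return w / w.sum() * len(counts)  # normalize so weights average to 1

def train_linear_head(feats, labels, n_classes, class_weights,
                      lr=0.1, epochs=200, seed=0):
    # Retrain only a linear classifier on frozen backbone features,
    # minimizing a per-class weighted softmax cross-entropy loss
    # with plain batch gradient descent.
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(feats.shape[1], n_classes))
    b = np.zeros(n_classes)
    n = len(labels)
    for _ in range(epochs):
        logits = feats @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = probs.copy()
        grad[np.arange(n), labels] -= 1.0          # d(loss)/d(logits)
        grad *= class_weights[labels][:, None] / n  # reweight each sample
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b
```

In this setup only `W` and `b` are updated; the backbone that produced `feats` never changes, which is exactly what makes the adaptation cheap compared to a full NAS run on the imbalanced target dataset.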

Paper Structure

This paper contains 13 sections, 8 equations, 3 figures, and 2 tables.

Figures (3)

  • Figure 1: Long-tailed data distribution.
  • Figure 2: An overview of DARTS: (a) Operations on the edges are initially unspecified. (b) Continuous relaxation of the search space is achieved by placing a mixture of candidate operations on each edge. (c) Joint optimization of the mixing probabilities and the network weights is performed by solving a bilevel optimization problem. (d) The final architecture is derived from the learned mixing probabilities.
  • Figure 3: Four Different Label Distributions of CIFAR-10 dataset.
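The continuous relaxation described in the Figure 2 caption can be sketched in a few lines: each edge holds architecture parameters (alphas), the edge's output during search is the softmax-weighted mixture of all candidate operations, and the final architecture keeps the operation with the largest learned weight. The candidate operations below are hypothetical stand-ins for the convolutions, poolings, and skip connections in a real DARTS search space.

```python
import numpy as np

def softmax(a):
    # Numerically stable softmax over the architecture parameters.
    e = np.exp(a - a.max())
    return e / e.sum()

def mixed_op(x, alphas, ops):
    # DARTS continuous relaxation: an edge's output is the
    # softmax-weighted sum of every candidate operation applied to x.
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))

def derive_edge(alphas, op_names):
    # After search, the discrete architecture keeps the single
    # operation with the largest mixing probability on each edge.
    return op_names[int(np.argmax(alphas))]
```

In full DARTS, the alphas and the network weights are optimized jointly by solving a bilevel problem (alphas on validation loss, weights on training loss); the sketch above only shows the relaxation and the final discretization step.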