Revisiting Long-Tailed Learning: Insights from an Architectural Perspective

Yuhan Pan, Yanan Sun, Wei Gong

TL;DR

This work addresses long-tailed recognition by shifting focus from data and losses to neural architecture design. By systematically analyzing architectural components, the authors identify bottleneck topology, aggregated/hierarchical convolutions, activation placement, and BatchNorm as LT-friendly factors, and propose two LT-specific convolutions, LT-AggConv and LT-HierConv. They then introduce LT-DARTS, an LT-aware neural architecture search method featuring an LT-friendly search space and a Balanced Fixed Classifier to mitigate bias during search. Across CIFAR-LT, Places-LT, ImageNet-LT, and iNaturalist-LT, LT-DARTS delivers consistent architectural gains, achieving state-of-the-art results when combined with existing LT techniques and reducing tail-class error without sacrificing head-class performance. The findings demonstrate that architecture design is a powerful, orthogonal lever for improving LT performance and can be readily integrated with prevailing LT strategies for practical impact.

Abstract

Long-Tailed (LT) recognition has been widely studied to tackle the challenge of imbalanced data distributions in real-world applications. However, the design of neural architectures for LT settings has received limited attention, despite evidence showing that architecture choices can substantially affect performance. This paper aims to bridge the gap between LT challenges and neural network design by providing an in-depth analysis of how various architectures influence LT performance. Specifically, we systematically examine the effects of key network components on LT handling, such as topology, convolutions, and activation functions. Based on these observations, we propose two convolutional operations optimized for improved performance. Recognizing that operation interactions are also crucial to network effectiveness, we apply Neural Architecture Search (NAS) to facilitate efficient exploration. We propose LT-DARTS, a NAS method with a novel search space and search strategy specifically designed for LT data. Experimental results demonstrate that our approach consistently outperforms existing architectures across multiple LT datasets, achieving parameter-efficient, state-of-the-art results when integrated with current LT methods.

Paper Structure

This paper contains 30 sections, 4 equations, 13 figures, 6 tables, 1 algorithm.

Figures (13)

  • Figure 1: The performance of different architectures on the long-tailed CIFAR-10-LT dataset. The direction of black dashed lines indicates better architectures, as they achieve higher accuracy with fewer architectural parameters.
  • Figure 2: Heatmap of Kendall's Tau correlation under different imbalance factors. A lower tau (lighter color) indicates weaker performance correlation.
  • Figure 3: An overview of the architectural properties we explore. "Basic" represents the foundational architecture. "Bottleneck" refers to an improved topology, with the activation functions and normalization methods for each layer omitted for simplicity. Additionally, we investigate the design of convolution, placement of activation, normalization, and activation.
  • Figure 4: Comparison between "Basic" and "Bottleneck"; $\rho$ denotes the imbalance ratio.
  • Figure 5: Performance of different convolutions on CIFAR-10-LT with $\rho=\{50, 100\}$.
  • ...and 8 more figures
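
For readers unfamiliar with the statistic named in the Figure 2 caption, the following is a minimal sketch of Kendall's Tau as a rank-correlation measure between two accuracy lists for the same set of architectures. It uses the simple tau-a variant, which may differ from the variant the authors used; the function name `kendall_tau` and all accuracy values are hypothetical and purely illustrative, not results from the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # Positive product: both lists order the pair (i, j) the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical accuracies (illustrative only) for the same five architectures,
# once on balanced data and once on long-tailed data.
acc_balanced = [0.95, 0.94, 0.93, 0.92, 0.91]
acc_long_tailed = [0.78, 0.74, 0.80, 0.71, 0.76]

tau = kendall_tau(acc_balanced, acc_long_tailed)
print(f"Kendall's tau: {tau:.2f}")
```

A tau near 1 would mean the two settings rank architectures almost identically, while a value near 0, as in this made-up example, would mean that a ranking obtained under one setting says little about the ranking under the other.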