Long-Tailed Recognition via Information-Preservable Two-Stage Learning

Fudong Lin, Xu Yuan

TL;DR

This work tackles long-tailed recognition by introducing a two-stage learning framework that first builds high-quality, well-separated feature spaces via Balanced Negative Sampling (BNS), which maximizes mutual information between augmented views and is theoretically tied to minimizing intra-class distance. In the second stage, Information-Preservable Determinantal Point Process (IP-DPP) samples balanced, information-rich subsets using an L-ensemble DPP construction, prioritizing instances with high information content while maintaining diversity. The approach achieves state-of-the-art results across CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018, with strong tail performance and competitive overall accuracy, supported by linear probing and ablation studies. By preserving valuable information through IP-DPP and enriching representations through BNS, the method offers a robust, generalizable solution to majority bias in imbalanced data, applicable across architectures and scales.
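
The link between contrastive training and mutual information can be made concrete with the standard InfoNCE objective, which satisfies $I(Q; V) \geq \log n - \mathcal{L}_{\text{NCE}}$ for a batch of $n$ paired views. The sketch below is a generic NumPy illustration of that bound, not the paper's BNS loss; the function names, the `temperature` value, and the use of in-batch negatives are assumptions made for illustration only.

```python
import numpy as np


def info_nce_loss(q, v, temperature=0.1):
    """Generic InfoNCE loss (illustrative; not the paper's BNS objective).

    q, v: arrays of shape (n, d). Row i of q and row i of v are assumed to be
    features of two augmented views of the same image (a positive pair);
    every other row of v serves as an in-batch negative for q[i].
    """
    # L2-normalize so the dot product is a cosine similarity.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)

    logits = q @ v.T / temperature                    # (n, n) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal; average their negative log-likelihood.
    return -np.mean(np.diag(log_prob))


def mi_lower_bound(q, v, temperature=0.1):
    """InfoNCE lower bound on mutual information: log(n) - loss."""
    return np.log(q.shape[0]) - info_nce_loss(q, v, temperature)


rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16))
v_aligned = q + 0.01 * rng.normal(size=(8, 16))   # views that agree closely
v_random = rng.normal(size=(8, 16))               # unrelated "views"

print("loss (aligned views):", info_nce_loss(q, v_aligned))
print("loss (random views): ", info_nce_loss(q, v_random))
```

Lowering the loss tightens the bound, so pulling the two views of each image together (and, by extension, same-class features) raises the estimated mutual information. How BNS selects negatives and additional positive pairs is specified in the paper and is not reproduced here.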

Abstract

Imbalance (the long tail) is inherent to many real-world data distributions and often biases deep classification models toward frequent classes, resulting in poor performance on tail classes. In this paper, we propose a novel two-stage learning approach to mitigate this majority bias while preserving valuable information within datasets. Specifically, the first stage proposes a new representation learning technique from an information-theoretic perspective. This approach is theoretically equivalent to minimizing intra-class distance, yielding an effective and well-separated feature space. The second stage develops a novel sampling strategy that selects mathematically informative instances and can rectify majority-biased decision boundaries without compromising a model's overall performance. As a result, our approach achieves state-of-the-art performance across various long-tailed benchmark datasets, as validated by extensive experiments. Our code is available at https://github.com/fudong03/BNS_IPDPP.
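
To illustrate the kind of L-ensemble DPP construction named in the TL;DR, the sketch below builds a kernel with the standard quality-diversity decomposition $L_{ij} = q_i \, \phi_i^{\top} \phi_j \, q_j$, under which a subset $Y$ has probability proportional to $\det(L_Y)$. It then picks $k$ items with a simple greedy determinant maximization. This is a generic illustration, not the paper's IP-DPP: the `quality` scores are a hypothetical stand-in for per-instance information content, and the greedy step is a deterministic approximation rather than true DPP sampling.

```python
import numpy as np


def build_l_ensemble(features, quality):
    """Quality-diversity L-ensemble kernel: L = diag(q) @ S @ diag(q).

    features: (n, d) array of instance features.
    quality:  (n,) positive scores (hypothetical proxy for information content).
    """
    phi = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = phi @ phi.T                      # cosine similarity, PSD
    return quality[:, None] * similarity * quality[None, :]


def greedy_select(L, k):
    """Greedily add the item that maximizes log det(L_Y), up to k items."""
    selected = []
    remaining = list(range(L.shape[0]))
    for _ in range(k):
        best_item, best_logdet = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_item, best_logdet = i, logdet
        if best_item is None:                     # no candidate keeps det > 0
            break
        selected.append(best_item)
        remaining.remove(best_item)
    return selected


# Items 0 and 1 are near-duplicates; item 2 points in a different direction.
features = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]])
quality = np.ones(3)

L = build_l_ensemble(features, quality)
print(greedy_select(L, k=2))
```

Because the determinant shrinks when two selected rows are nearly parallel, the selection keeps only one of the near-duplicates and pairs it with the dissimilar item. Raising an item's quality score scales its row and column of $L$, making subsets that contain it more likely.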

Paper Structure

This paper contains 30 sections, 9 theorems, 48 equations, 4 figures, 11 tables, 2 algorithms.

Key Result

Theorem 4.1

(Intra-Class Distance Mutual Information Theorem) Let $\mathbb{X}_{Q}^{c}$ and $\mathbb{X}_{V}^{c}$ be two sets of images with the same label $c$, obtained by different data augmentation techniques. Given a feature extractor $f_{\bm{\theta}} ( \cdot )$, we define $\bm{Q}^{c}$ and $\bm{V}^{c}$ as the where $D(\bm{Q}^{c}, \bm{V}^{c})$ can be considered as the intra-class distance because they have t

Figures (4)

  • Figure 1: Linear probing accuracies on (a) CIFAR-10-LT and (b) CIFAR-100-LT datasets.
  • Figure 2: t-SNE visualization of CIFAR-10 feature space. (a) and (b): visual representation learned by SBCL, as well as (c) and (d): visual representation captured by our approach.
  • Figure 3: Linear probing accuracy on CIFAR-10-LT using different numbers of additional positive pairs (i.e., $m$).
  • Figure 4: Impact of the fixed sample size (i.e., $k$) on imbalanced classification, where many-shot, medium-shot, few-shot, and overall accuracies on CIFAR-100 are reported.

Theorems & Definitions (17)

  • Theorem 4.1
  • Definition 4.2: Determinantal Point Process
  • Remark 4.3: Properties of Marginal Kernel Matrix
  • Lemma 4.4
  • Lemma 4.5
  • Theorem 4.6
  • Remark 4.7: Information-Preserving Sampling Principle
  • Theorem A.1
  • proof
  • Lemma A.2
  • ...and 7 more