
When resampling/reweighting improves feature learning in imbalanced classification?: A toy-model study

Tomoyuki Obuchi, Toshiyuki Tanaka

TL;DR

The paper addresses why class-wise resampling and reweighting sometimes fail to improve feature learning in imbalanced classification. It develops a toy model with Gaussian class-conditional inputs and analyzes it in the high-dimensional limit via the replica method under the replica-symmetric (RS) ansatz, deriving equations of state (EOS) for the order parameters, notably the overlap $m$ with the true discriminative direction. A key finding is that when the class variances are equal and the decision boundary is placed equidistant from the class centers, feature learning is best with no resampling/reweighting (i.e., $s_+=\tfrac{1}{2}$). Under symmetry, this result holds across losses and classifiers, and it provides analytical support for the observations of Kang et al. (2019). Numerical experiments corroborate the theory in equal-variance settings and reveal deviations in unequal-variance cases. The authors also extend the idea to a simplified multiclass model, highlighting when resampling is beneficial for feature learning. The work offers theoretical guidance on when resampling strategies improve representation learning and connects to broader phenomena such as neural collapse in deep networks.

Abstract

A toy model of binary classification is studied with the aim of clarifying the effect of class-wise resampling/reweighting on feature learning performance in the presence of class imbalance. In the analysis, a high-dimensional limit of the input space is taken while keeping the ratio of the dataset size to the input dimension finite, and the non-rigorous replica method from statistical mechanics is employed. The result shows that there exists a case in which the no resampling/reweighting situation gives the best feature learning performance irrespective of the choice of losses or classifiers, supporting recent findings in Cao et al. (2019); Kang et al. (2019). It is also revealed that the key to the result is the symmetry of the loss and the problem setting. Inspired by this, we propose a further simplified model exhibiting the same property in the multiclass setting. These results clarify when class-wise resampling/reweighting becomes effective in imbalanced classification.
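To make the quantities in the summary concrete, the following is a minimal finite-size numerical sketch, not the authors' replica calculation. It rests on assumptions that are not taken from the paper: class centers at $\pm\bm{w}_0$ with isotropic Gaussian noise of standard deviation $\sigma$, a per-sample weight $s_+$ for the positive class and $1-s_+$ for the negative class (chosen so that $s_+=\tfrac{1}{2}$ corresponds to no reweighting, as stated in the TL;DR), and plain gradient descent on a reweighted logistic loss. The function name `overlap_after_training` and all default values are hypothetical.

```python
import numpy as np

def overlap_after_training(r_plus=0.2, s_plus=0.5, n_dim=50, n_samples=2000,
                           sigma=0.6, n_steps=500, lr=0.5, seed=0):
    """Train a class-reweighted logistic classifier on toy Gaussian data and
    return the normalized overlap m between the learned weights and w0."""
    rng = np.random.default_rng(seed)

    # Assumed data model: unit-norm discriminative direction w0,
    # class centers at +w0 and -w0, isotropic Gaussian noise.
    w0 = rng.standard_normal(n_dim)
    w0 /= np.linalg.norm(w0)
    y = np.where(rng.random(n_samples) < r_plus, 1.0, -1.0)  # P(y=+1) = r_plus
    x = y[:, None] * w0 + sigma * rng.standard_normal((n_samples, n_dim))

    # Assumed reweighting convention: s_plus per positive sample and
    # 1 - s_plus per negative sample, so s_plus = 0.5 means equal weights.
    weights = np.where(y > 0, s_plus, 1.0 - s_plus)

    w = np.zeros(n_dim)
    b = 0.0
    for _ in range(n_steps):
        margin = y * (x @ w + b)
        # 1 / (1 + exp(margin)), written with tanh for numerical stability.
        sig = 0.5 * (1.0 - np.tanh(0.5 * margin))
        g = -weights * y * sig            # d(loss)/d(logit) for each sample
        w -= lr * (x.T @ g) / n_samples
        b -= lr * g.sum() / n_samples

    # Overlap with the true direction (w0 has unit norm).
    return float(w @ w0 / np.linalg.norm(w))

if __name__ == "__main__":
    for s in (0.1, 0.5, 0.9):
        print(f"s_plus={s:.1f}  m={overlap_after_training(s_plus=s):.4f}")
```

Sweeping `s_plus` at fixed `r_plus` gives a rough empirical counterpart to the overlap-versus-$s_+$ curves described in the figure captions. The values depend on the assumed data model and on finite-size effects, so they should not be read as reproducing the paper's results.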

Paper Structure (25 sections, 64 equations, 14 figures)


Figures (14)

  • Figure 1: PDFs of $\bm{x}$ projected onto $\bm{w}_0$ with $\sigma=0.6$ for $r_+=0.5$ (left) and $0.2$ (right).
  • Figure 2: Plots of $m$ and $u$ against $b$ in the balanced case $r_+=0.5$ for $s_+=0.1$ (left), $0.5$ (middle), and $0.9$ (right). (a) Zero-one loss with perceptron $\mathcal{\ell}_{\mathrm{01pe}}$. (b) Cross-entropy loss with logistic function $\mathcal{\ell}_{\mathrm{CElo}}$.
  • Figure 3: Plots of $m$ and $u$ against $b$ in the imbalanced case $r_+=0.2$ for $s_+=0.1$ (left), $0.5$ (middle), and $0.9$ (right). (a) Zero-one loss with perceptron $\mathcal{\ell}_{\mathrm{01pe}}$. (b) Cross-entropy loss with logistic function $\mathcal{\ell}_{\mathrm{CElo}}$.
  • Figure 4: Plots of $m_{\rm max},u(m_{\rm max}),b(m_{\rm max})$ against $s_+$ for $r_+=0.5$ (left) and $r_+=0.2$ (right). (a) Zero-one loss with perceptron $\mathcal{\ell}_{\mathrm{01pe}}$. (b) Cross-entropy loss with logistic function $\mathcal{\ell}_{\mathrm{CElo}}$.
  • Figure 5: Plots of $m(u_{\rm min}),u_{\rm min},b(u_{\rm min})$ against $s_+$ for $r_+=0.5$ (left) and $r_+=0.2$ (right). (a) Zero-one loss with perceptron $\mathcal{\ell}_{\mathrm{01pe}}$. (b) Cross-entropy loss with logistic function $\mathcal{\ell}_{\mathrm{CElo}}$.
  • ...and 9 more figures