Reviving Undersampling for Long-Tailed Learning

Hao Yu, Yingxiao Du, Jianxin Wu

TL;DR

This work tackles long-tailed recognition by shifting the focus to the worst-performing classes, measured with the harmonic and geometric means of per-class accuracy. It introduces Balanced Training and Merging (BTM), a plug-in pipeline that fine-tunes a model on multiple few-shot balanced subsets and merges the resulting models by averaging, improving worst-case per-class accuracy with little or no loss in average accuracy. Across Places-LT, ImageNet-LT, and iNaturalist2018, BTM yields substantial gains in harmonic and geometric means, and it can be combined with methods like GML for further improvements, all while preserving inference efficiency. The approach is lightweight, broadly compatible with existing decoupling strategies, and supported by public code, making it practical for real-world long-tailed learning deployments.

Abstract

The training datasets used in long-tailed recognition are extremely unbalanced, resulting in significant variation in per-class accuracy across categories. Prior works mostly used average accuracy to evaluate their algorithms, which easily overlooks the worst-performing categories. In this paper, we aim to enhance the accuracy of the worst-performing categories and use the harmonic mean and geometric mean to assess the model's performance. We revive the balanced undersampling idea to achieve this goal. Balanced subsets are few-shot and will surely under-fit, hence undersampling is not used in modern long-tailed learning. However, we find that it produces a more equitable distribution of accuracy across categories, with much higher harmonic and geometric mean accuracy but lower average accuracy. Moreover, we devise a straightforward model ensemble strategy, which does not result in any additional overhead and achieves improved harmonic and geometric mean while keeping the average accuracy almost intact when compared to state-of-the-art long-tailed learning methods. We validate the effectiveness of our approach on widely used benchmark datasets for long-tailed learning. Our code is at https://github.com/yuhao318/BTM/.
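To make the choice of metric concrete, the following is a minimal sketch of how the arithmetic, geometric, and harmonic means of per-class accuracy could be computed. It is not taken from the authors' code: the function names `per_class_accuracy` and `summarize`, the `eps` floor for zero-accuracy classes, and the treatment of classes without test samples are all assumptions made for illustration.

```python
import numpy as np


def per_class_accuracy(y_true, y_pred, num_classes):
    """Per-class recall: correct predictions of class c / samples of class c."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.zeros(num_classes)
    for c in range(num_classes):
        mask = y_true == c
        # Assumption: a class with no test samples is scored as 0.
        acc[c] = (y_pred[mask] == c).mean() if mask.any() else 0.0
    return acc


def summarize(acc, eps=1e-8):
    """Arithmetic, geometric, and harmonic mean of per-class accuracies."""
    acc = np.asarray(acc, dtype=float)
    # Assumption: floor accuracies at eps so that a zero-accuracy class
    # does not cause log(0) or a division by zero.
    clipped = np.maximum(acc, eps)
    return {
        "arithmetic": acc.mean(),
        "geometric": np.exp(np.log(clipped).mean()),
        "harmonic": len(clipped) / (1.0 / clipped).sum(),
    }


# Two hypothetical models with the same average accuracy (0.5):
balanced = summarize([0.5, 0.5, 0.5])
skewed = summarize([0.9, 0.5, 0.1])
print(balanced)
print(skewed)
```

Both hypothetical models have the same arithmetic mean, but the skewed one scores lower on the geometric mean and lower still on the harmonic mean, because both are pulled down by the worst class. This is the property that makes them suitable for tracking the worst-performing categories.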

Paper Structure (19 sections, 3 equations, 3 figures, 14 tables)

This paper contains 19 sections, 3 equations, 3 figures, 14 tables.

Figures (3)

  • Figure 1: Harmonic and geometric mean of models interpolated between the raw model $f$ ($\lambda=0$) and the fine-tuned model $f^{D}$ ($\lambda=1$).
  • Figure 2: The blue curves show the harmonic and geometric mean of models interpolated between the fine-tuned model $f^{D_A}$ ($\lambda=0$) and the fine-tuned model $f^{D_B}$ ($\lambda=1$). The yellow curves show the harmonic and geometric mean of $f^{D_{A\cup B}}$.
  • Figure 3: Visualization of the change in the distribution of per-class recall (i.e., accuracy). One panel shows that performing balanced training on our sampled few-shot datasets and then merging all models together greatly improves the performance of the model. The other panel compares the per-class accuracy of our final model with that of MiSLAS.
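
The interpolation between two models and the merging of several fine-tuned models described in these captions can be sketched as follows. This is an illustrative sketch only: it assumes the interpolation is linear in weight space and that each model's parameters are available as a dictionary of NumPy arrays with identical keys and shapes. The names `interpolate` and `merge_by_averaging` are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def interpolate(state_a, state_b, lam):
    """Return (1 - lam) * state_a + lam * state_b for every parameter.

    lam = 0 recovers state_a and lam = 1 recovers state_b.
    """
    return {k: (1.0 - lam) * state_a[k] + lam * state_b[k] for k in state_a}


def merge_by_averaging(states):
    """Uniformly average the parameters of several models."""
    keys = states[0].keys()
    return {k: np.mean([s[k] for s in states], axis=0) for k in keys}


# Toy parameters standing in for two fine-tuned models (hypothetical values).
f_a = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
f_b = {"w": np.array([3.0, 0.0]), "b": np.array([1.0])}

# Sweep lambda as in an interpolation curve; each point would then be evaluated.
for lam in np.linspace(0.0, 1.0, 5):
    print(lam, interpolate(f_a, f_b, lam)["w"])

merged = merge_by_averaging([f_a, f_b])
print(merged)
```

Because the merged result is a single set of parameters, inference costs the same as running one model, which is consistent with the claim that merging adds no overhead at test time.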