LTRL: Boosting Long-tail Recognition via Reflective Learning

Qihao Zhao, Yalun Dai, Shen Lin, Wei Hu, Fan Zhang, Jun Liu

TL;DR

This work tackles long-tail recognition by introducing Reflective Learning (RL), a plug-and-play paradigm that mimics human review, summarization, and correction to balance head and tail classes. RL comprises three modules that operate throughout training: Knowledge Review ($\text{L}_{KR}$, a KL divergence between past and present predictions on correctly classified instances), Knowledge Summary ($\text{L}_{KS}$, soft class-correlation labels derived from the cosine similarity of class feature centers), and Knowledge Correction (gradient projection to resolve negative transfer). Empirical results on CIFAR100-LT, ImageNet-LT, Places-LT, and iNaturalist show consistent gains over state-of-the-art long-tail (LT) methods, especially for tail classes, while remaining compatible with diverse backbones. The approach is lightweight and broadly applicable, offering a practical route to more balanced long-tail recognition and potential extensions to other domains.
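As a rough illustration of the Knowledge Review term, the sketch below computes a KL divergence between previous-epoch and current-epoch softmax predictions, averaged over correctly classified instances only. The direction of the KL term, the use of the previous epoch's prediction to decide which instances count as correct, the averaging, and every function name are assumptions made for illustration; the summary does not specify them. Plain Python lists stand in for tensors to keep the example self-contained.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def knowledge_review_loss(prev_logits, curr_logits, labels):
    """Mean KL(prev || curr) over samples the previous epoch got right."""
    losses = []
    for prev, curr, y in zip(prev_logits, curr_logits, labels):
        p_prev = softmax(prev)
        predicted = max(range(len(p_prev)), key=p_prev.__getitem__)
        # Assumed filter: skip samples whose past prediction was wrong,
        # so that only reliable past predictions act as soft labels.
        if predicted != y:
            continue
        losses.append(kl_divergence(p_prev, softmax(curr)))
    return sum(losses) / len(losses) if losses else 0.0
```

In this sketch the term is zero when the model's predictions on the retained samples have not changed between epochs, and it grows as the current predictions drift away from the past ones.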

Abstract

In real-world scenarios, knowledge distributions exhibit a long tail. Humans nevertheless manage to master knowledge uniformly across imbalanced distributions, a feat attributed to their diligent practice of reviewing, summarizing, and correcting errors. Motivated by this learning process, we propose a novel learning paradigm, called reflecting learning, for handling long-tail recognition. Our method integrates three processes: reviewing past predictions during training, summarizing and leveraging the feature relations across classes, and correcting gradient conflicts between loss functions. These designs are lightweight enough to plug and play with existing long-tail learning methods, achieving state-of-the-art performance on popular long-tail visual benchmarks. The experimental results highlight the great potential of reflecting learning in dealing with long-tail recognition.
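To make the idea of correcting gradient conflicts concrete, here is a minimal sketch of one common form of gradient projection: when two gradients point in opposing directions (negative dot product), the conflicting component is removed from one of them before they are summed. This follows the general PCGrad-style recipe and is an assumption; the abstract does not say which gradient is projected or how the result is combined. The function names are hypothetical and gradients are represented as flat Python lists.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def project_if_conflicting(grad, ref):
    """Return `grad` with its component opposing `ref` removed.

    If the two gradients do not conflict (dot product >= 0),
    `grad` is returned unchanged (as a copy).
    """
    d = dot(grad, ref)
    if d >= 0:
        return list(grad)
    # d < 0 implies `ref` is non-zero, so the division is safe.
    scale = d / dot(ref, ref)
    return [g - scale * r for g, r in zip(grad, ref)]

def corrected_update(grad_main, grad_aux):
    """Sum the main gradient and the projected auxiliary gradient.

    Assumption: the auxiliary loss gradient is the one that gets projected.
    """
    aux = project_if_conflicting(grad_aux, grad_main)
    return [m + a for m, a in zip(grad_main, aux)]
```

For example, with `grad_main = [1.0, 0.0]` and `grad_aux = [-1.0, 1.0]`, the projected auxiliary gradient becomes `[0.0, 1.0]`, which is orthogonal to the main gradient and no longer cancels its progress.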

Paper Structure (12 sections, 12 equations, 12 figures, 1 table)

Figures (12)

  • Figure 1: Comparison of model outputs (logits) and Kullback–Leibler (KL) divergence. The analysis is conducted on the CIFAR100-LT dataset with an Imbalance Factor (IF) of 100. The logits, KL divergence, and accuracy are computed over the whole test set, and the per-class averages are reported. (a): The dashed line represents the direction of the long-tail distribution in data volume; the prediction consistency (Overlap) of the head classes is significantly higher than that of the tail classes. (b) and (c): Per-class KL divergence and top-1 accuracy of Cross-Entropy (CE) on Long-Tail Data (LTD) and Balanced Data (BD), together with the results on LTD after incorporating our proposed method. Compared to the original Cross-Entropy, our method not only significantly reduces the overall prediction divergence but also alleviates the divergence imbalance caused by the inconsistency in predictions between head and tail classes. At the same time, our method significantly improves the model's accuracy on the test set and narrows the gap by which head-class accuracy exceeds tail-class accuracy due to data imbalance.
  • Figure 2: Correlation of features among different samples in long-tailed data.
  • Figure 3: The framework of our method. The prediction of the previous epoch (t-1) serves as a soft label to regularize the prediction of the current epoch (t). During regularization, we first use Correctly Classified Instances (CCI) to select the correctly predicted samples (indicated in green). We then employ the Knowledge Review module to regularize the uncertainty between the logits-based predictions from the past and current epochs. Meanwhile, we compute the median of the features from the previous epoch to represent the characteristic features. The inter-class feature correlations are then characterized using cosine similarity, resulting in a similarity matrix that serves as soft class-correlation labels for each category. By combining these soft labels with one-hot labels in a weighted manner, we derive the final supervisory labels for the model's learning process, a method we term Knowledge Summary. Finally, the proposed Knowledge Correction module rectifies gradient conflicts during training.
  • Figure 4: (a) Illustration of gradient conflicts. (Top) Optimizing according to Eq. 5. (Bottom) Optimizing with our method. (b) The proportion of conflicting gradients contained in each layer of the model (216 layers in total for ResNet-32). (c) The proportion of layers in the network containing gradient conflicts at each epoch.
  • Figure 5: Comparisons on the CIFAR100-LT datasets with IFs of 10, 50, and 100. † denotes models trained with RandAugment (Cubuk et al., 2020) for 400 epochs.
  • ...and 7 more figures
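
The Knowledge Summary step described in the Figure 3 caption can be sketched roughly as follows: take a per-class median of the previous epoch's features, build a cosine-similarity matrix between the class centers, and blend each row with the one-hot label. How the similarity row is turned into a distribution (a softmax here) and the blending weight `alpha` are assumptions, as are all function names; the caption only states that the labels are combined "in a weighted manner".

```python
import math
from statistics import median

def class_centers(features, labels, num_classes):
    """Per-class, per-dimension median of the feature vectors.

    Assumes every class has at least one sample.
    """
    dim = len(features[0])
    centers = []
    for c in range(num_classes):
        rows = [f for f, y in zip(features, labels) if y == c]
        centers.append([median(r[d] for r in rows) for d in range(dim)])
    return centers

def cosine(a, b, eps=1e-12):
    """Cosine similarity between two vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / (den + eps)

def soft_labels(centers, alpha=0.8):
    """Blend one-hot labels with softmax-normalized similarity rows.

    `alpha` (hypothetical) is the weight on the one-hot label.
    """
    n = len(centers)
    labels = []
    for i in range(n):
        sims = [cosine(centers[i], centers[j]) for j in range(n)]
        exps = [math.exp(s) for s in sims]
        total = sum(exps)
        labels.append([
            alpha * (1.0 if j == i else 0.0) + (1.0 - alpha) * exps[j] / total
            for j in range(n)
        ])
    return labels
```

Each row of the result sums to one and keeps most of its mass on the true class, while the remainder is spread over the other classes in proportion to how similar their feature centers are.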