Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts
Jiang-Xin Shi, Tong Wei, Zhi Zhou, Jie-Jing Shao, Xin-Yan Han, Yu-Feng Li
TL;DR
The paper reveals that heavy fine-tuning of foundation models can deteriorate tail-class performance in long-tail learning. It introduces LIFT, a lightweight, single-stage fine-tuning framework that uses structured lightweight modules, semantic-aware initialization, and test-time ensembling to preserve class-conditional distributions while boosting discriminative power. LIFT achieves competitive or superior results across ImageNet-LT, Places-LT, iNaturalist 2018, and CIFAR-100-LT with far fewer tunable parameters and epochs, often without external data. The work demonstrates rapid convergence (often under 20 epochs) and practical efficiency, providing a robust pathway for deploying foundation-model-based long-tail learners.
Abstract
The fine-tuning paradigm for long-tail learning has attracted significant interest since the emergence of foundation models. Nonetheless, how fine-tuning impacts performance in long-tail learning has not been explicitly quantified. In this paper, we show that heavy fine-tuning may even lead to non-negligible performance deterioration on tail classes, and that lightweight fine-tuning is more effective. We attribute this to the inconsistent class conditions caused by heavy fine-tuning. Building on this observation, we develop LIFT, a low-complexity and accurate long-tail learning algorithm that aims to facilitate fast prediction and compact models through adaptive lightweight fine-tuning. Experiments verify that both the training time and the number of learned parameters are significantly reduced, with more accurate predictive performance than state-of-the-art approaches. The implementation code is available at https://github.com/shijxcs/LIFT.
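The TL;DR names test-time ensembling as one component of LIFT. The snippet below is only a minimal sketch of the general idea, assuming that several views of one input (for example, different crops) are scored by the same model and the logits are averaged before taking the argmax. The names `ensemble_logits`, `predict`, and `toy_model` are hypothetical and are not taken from the LIFT repository. The actual procedure used by the authors may differ.

```python
from typing import Callable, List, Sequence


def ensemble_logits(
    model: Callable[[object], Sequence[float]],
    views: Sequence[object],
) -> List[float]:
    """Average per-class logits over several views of the same input."""
    if not views:
        raise ValueError("need at least one view")
    all_logits = [list(model(view)) for view in views]
    num_classes = len(all_logits[0])
    return [
        sum(logits[c] for logits in all_logits) / len(all_logits)
        for c in range(num_classes)
    ]


def predict(
    model: Callable[[object], Sequence[float]],
    views: Sequence[object],
) -> int:
    """Return the index of the class with the highest averaged logit."""
    avg = ensemble_logits(model, views)
    return max(range(len(avg)), key=avg.__getitem__)


# Toy stand-in for a fine-tuned model (an assumption for illustration only):
# it maps a view identifier to fixed three-class logits.
_TOY_LOGITS = {
    "crop_a": [2.0, 1.0, 0.0],
    "crop_b": [0.0, 3.0, 0.0],
    "crop_c": [0.0, 2.0, 1.0],
}


def toy_model(view: object) -> Sequence[float]:
    return _TOY_LOGITS[view]


single_view_prediction = predict(toy_model, ["crop_a"])
ensembled_prediction = predict(toy_model, ["crop_a", "crop_b", "crop_c"])
```

In this toy setup the first view alone favours class 0, while averaging over the three views favours class 1, which illustrates how ensembling can smooth out a prediction that depends on a single crop.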
