LT-Soups: Bridging Head and Tail Classes via Subsampled Model Soups
Masih Aminbeidokhti, Subhankar Roy, Eric Granger, Elisa Ricci, Marco Pedersoli
TL;DR
This work addresses the challenge of long-tailed (LT) distributions, where head classes dominate and tail classes are underrepresented. It introduces LT-Soups, a two-stage model-soup framework that first averages models fine-tuned on subsets with varying imbalance and then retrains only the classifier on the full data, guided by a head-tail ratio $\eta$ and an imbalance ratio $\rho$. The method performs robustly across a wide spectrum of LT regimes, outperforming parameter-efficient fine-tuning (PEFT) and traditional model soups on synthetic and real LT benchmarks while remaining efficient. By combining representation-rich fine-tuning with targeted classifier calibration, LT-Soups offers a practical, scalable way to bridge head and tail performance in foundation-model-based long-tailed recognition.
Abstract
Real-world datasets typically exhibit long-tailed (LT) distributions, where a few head classes dominate and many tail classes are severely underrepresented. While recent work shows that parameter-efficient fine-tuning (PEFT) methods like LoRA and AdaptFormer preserve tail-class performance on foundation models such as CLIP, we find that they do so at the cost of head-class accuracy. We identify the head-tail ratio, the proportion of head to tail classes, as a crucial but overlooked factor influencing this trade-off. Through controlled experiments on CIFAR100 with varying imbalance ratio ($\rho$) and head-tail ratio ($\eta$), we show that PEFT excels in tail-heavy scenarios but degrades in more balanced and head-heavy distributions. To overcome these limitations, we propose LT-Soups, a two-stage model-soup framework designed to generalize across diverse LT regimes. In the first stage, LT-Soups averages models fine-tuned on balanced subsets to reduce head-class bias; in the second, it fine-tunes only the classifier on the full dataset to restore head-class accuracy. Experiments across six benchmark datasets show that LT-Soups achieves superior trade-offs compared to both PEFT and traditional model soups across a wide range of imbalance regimes.
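To make the quantities and the first stage concrete, the following is a minimal, dependency-free sketch rather than the authors' implementation. It assumes that $\rho$ is the usual largest-to-smallest class-size ratio and that $\eta$ is the number of head classes divided by the number of tail classes; the abstract does not give exact formulas. The function names, the `head_threshold` cutoff, and the representation of model weights as dictionaries of float lists are all hypothetical choices made for illustration. The second stage (retraining only the classifier on the full dataset) is not shown.

```python
import random
from collections import Counter, defaultdict


def imbalance_ratio(labels):
    """rho (assumed definition): largest class size / smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def head_tail_ratio(labels, head_threshold):
    """eta (assumed definition): number of head classes / number of tail classes.

    head_threshold is a hypothetical cutoff: a class with at least this many
    samples is counted as a head class, every other class as a tail class.
    """
    counts = Counter(labels)
    n_head = sum(1 for c in counts.values() if c >= head_threshold)
    n_tail = len(counts) - n_head
    return n_head / n_tail if n_tail else float("inf")


def balanced_subsample(labels, per_class, seed):
    """Return sorted indices of a subset with at most per_class samples per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    chosen = []
    for y in sorted(by_class):
        idxs = by_class[y]
        chosen.extend(rng.sample(idxs, min(per_class, len(idxs))))
    return sorted(chosen)


def uniform_soup(state_dicts):
    """Average parameters key by key across models that share one architecture.

    Each state dict maps a parameter name to a flat list of floats
    (an illustrative stand-in for real model tensors).
    """
    n = len(state_dicts)
    return {
        key: [sum(vals) / n for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }


# Toy long-tailed label set: two large classes and two small ones.
labels = [0] * 50 + [1] * 40 + [2] * 5 + [3] * 5
rho = imbalance_ratio(labels)                      # 50 / 5
eta = head_tail_ratio(labels, head_threshold=20)   # 2 head classes / 2 tail classes

# Stage-one idea: draw several balanced subsets, fine-tune one model per
# subset (fine-tuning itself is omitted here), then average the weights.
subsets = [balanced_subsample(labels, per_class=5, seed=s) for s in range(3)]
soup = uniform_soup([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}])
```

In this sketch each balanced subset caps every class at the same number of samples, so the models being averaged never see the head classes at full strength. That matches the abstract's motivation for a second stage that revisits the full dataset.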
