LT-Soups: Bridging Head and Tail Classes via Subsampled Model Soups

Masih Aminbeidokhti, Subhankar Roy, Eric Granger, Elisa Ricci, Marco Pedersoli

TL;DR

This work addresses the challenge of long-tailed distributions where head classes dominate and tail classes are underrepresented. It introduces LT-Soups, a two-stage model-soup framework that first averages models fine-tuned on subsets with varying imbalance and then retrains only the classifier on the full data, guided by a head-tail ratio $\eta$ and an imbalance ratio $\rho$. The method achieves robust performance across a wide spectrum of LT regimes, outperforming PEFT and traditional model soups on synthetic and real LT benchmarks, while maintaining efficiency. By blending representation-rich fine-tuning with targeted classifier calibration, LT-Soups provides a practical, scalable approach to bridging head and tail performance in foundation-model-based long-tailed recognition.
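The averaging step in the first stage can be illustrated with a minimal sketch. It assumes uniform weight averaging over models that share one architecture, which is the simplest form of a model soup; the paper's actual averaging scheme may differ. The function name `uniform_soup` and the toy parameter names are hypothetical, chosen only for illustration.

```python
import numpy as np


def uniform_soup(state_dicts):
    """Average a list of parameter dicts key by key (uniform model soup).

    Assumption: every model has the same parameter names and shapes,
    as is the case when all are fine-tuned from one pretrained checkpoint.
    """
    if not state_dicts:
        raise ValueError("need at least one model")
    keys = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != keys:
            raise ValueError("all models must share the same parameter names")
    # Stack each parameter across models and take the element-wise mean.
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}


# Toy stand-ins for models fine-tuned on different subsets (hypothetical values).
models = [
    {"backbone.w": np.full((2, 2), 1.0), "head.w": np.array([0.0, 2.0])},
    {"backbone.w": np.full((2, 2), 3.0), "head.w": np.array([4.0, 2.0])},
]
soup = uniform_soup(models)
```

In the second stage described above, the averaged backbone would be kept frozen and only the classifier retrained on the full dataset; that step is omitted here because it depends on the training setup.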

Abstract

Real-world datasets typically exhibit long-tailed (LT) distributions, where a few head classes dominate and many tail classes are severely underrepresented. While recent work shows that parameter-efficient fine-tuning (PEFT) methods like LoRA and AdaptFormer preserve tail-class performance on foundation models such as CLIP, we find that they do so at the cost of head-class accuracy. We identify the head-tail ratio, the proportion of head to tail classes, as a crucial but overlooked factor influencing this trade-off. Through controlled experiments on CIFAR100 with varying imbalance ratio ($\rho$) and head-tail ratio ($\eta$), we show that PEFT excels in tail-heavy scenarios but degrades in more balanced and head-heavy distributions. To overcome these limitations, we propose LT-Soups, a two-stage model soups framework designed to generalize across diverse LT regimes. In the first stage, LT-Soups averages models fine-tuned on balanced subsets to reduce head-class bias; in the second, it fine-tunes only the classifier on the full dataset to restore head-class accuracy. Experiments across six benchmark datasets show that LT-Soups achieves superior trade-offs compared to both PEFT and traditional model soups across a wide range of imbalance regimes.
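To make the two quantities concrete, the sketch below computes them from per-class sample counts. It assumes the common convention that $\rho$ is the largest class size divided by the smallest, and it reads $\eta$ as the number of head classes divided by the number of tail classes. The cut-off `head_threshold` that separates head from tail classes is a hypothetical parameter; the paper's own definition may use a different rule.

```python
def imbalance_ratio(counts):
    """rho: size of the largest class divided by the size of the smallest."""
    return max(counts) / min(counts)


def head_tail_ratio(counts, head_threshold):
    """eta: number of head classes divided by number of tail classes.

    Assumption: a class is a head class when it has at least
    `head_threshold` samples; every other class is a tail class.
    """
    n_head = sum(c >= head_threshold for c in counts)
    n_tail = len(counts) - n_head
    if n_tail == 0:
        return float("inf")  # no tail classes at all
    return n_head / n_tail


# Toy class counts (hypothetical): two large classes, four small ones.
counts = [500, 500, 50, 5, 5, 5]
rho = imbalance_ratio(counts)
eta = head_tail_ratio(counts, head_threshold=100)
```

Under this reading, a small $\eta$ corresponds to the tail-heavy regime and a large $\eta$ to the head-heavy regime discussed above.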

Paper Structure

This paper contains 41 sections, 1 equation, 7 figures, 18 tables, 1 algorithm.

Figures (7)

  • Figure 1: Performance of baselines and LT-Soups on the CIFAR100 benchmark varying $\rho$ and $\eta$. While full fine-tuning generally outperforms PEFT on head classes, PEFT demonstrates superior performance on tail classes. In contrast, our approach maintains robust accuracy across all imbalance settings, showing resilience to shifts in both the sample distribution and class structure.
  • Figure 2: Marginalized performance of baselines, including LT-Soups, on CIFAR100 across varying $\rho$ and $\eta$. The first three columns average over $\eta$ for each $\rho$; the last column averages over all configurations. Refer to Figure 1 for the detailed results.
  • Figure 3: Comparison between Model Soups and LT-Soups. (a) Model Soups merges models fine-tuned on full, severely imbalanced training data. (b) LT-Soups merges models fine-tuned on subsets with increasingly higher imbalance ratios to preserve pretrained features while adapting to class distribution shifts.
  • Figure 4: Average performance across 5 LT benchmarks.
  • Figure 5: Performance and weight change comparison across different stages of LT-Soups.
  • ...and 2 more figures

Theorems & Definitions (1)

  • Definition 1