Non-Uniform Class-Wise Coreset Selection for Vision Model Fine-tuning

Hanyu Zhang, Zhen Xing, Ruian He, Wenxuan Yang, Chenxi Ma, Weimin Tan, Bo Yan

TL;DR

This work addresses the ineffectiveness of class-agnostic coreset pruning when fine-tuning large vision models by introducing Non-Uniform Class-Wise Coreset Selection (NUCS). NUCS first computes a global class difficulty $\mathbf{S}_j$ as the winsorized average of per-sample difficulty scores and allocates the coreset budget non-uniformly across classes using $b_j = (1-\alpha) \mathbf{S}_j N_j / T$, with normalization $T = (\sum_i \mathbf{S}_i N_i)/N$. It then selects intra-class samples within a difficulty window whose endpoint $k$ is predicted by linear ridge regression on learned features. A theoretical toy model shows that class budgets should be proportional to class difficulty, with boundary cases saturating on the hardest class, which supports the non-uniform allocation strategy. Empirically, NUCS and its grid-search upper bound NUCS-O achieve state-of-the-art or near-state-of-the-art performance across 10 diverse datasets and pretrained backbones, with notable accuracy gains and substantial efficiency improvements, including strong cross-domain results (e.g., NIH ChestX-ray14). These results support non-uniform class-wise coreset selection as a practical and scalable approach to efficient fine-tuning of large vision foundation models.
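The allocation rule stated above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the winsorization limit of 0.1, and the reading of $\alpha$ as the overall pruning rate are assumptions (the last one is consistent with the budgets summing to $(1-\alpha)N$). The per-sample difficulty scores are taken as given, for example EL2N scores as used in Figure 2.

```python
import numpy as np

def winsorized_mean(scores, limit=0.1):
    """Mean after clamping values to the [limit, 1 - limit] quantile range.
    The limit of 0.1 is an assumed value, not taken from the paper."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = np.quantile(scores, [limit, 1.0 - limit])
    return float(np.clip(scores, lo, hi).mean())

def allocate_budgets(scores, labels, alpha, limit=0.1):
    """Return {class: b_j} with b_j = (1 - alpha) * S_j * N_j / T,
    where T = (sum_i S_i * N_i) / N. Budgets are real-valued."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    N = len(labels)
    # S_j: winsorized average difficulty of class j
    S = np.array([winsorized_mean(scores[labels == c], limit) for c in classes])
    # N_j: number of samples in class j
    N_j = np.array([(labels == c).sum() for c in classes])
    T = (S * N_j).sum() / N
    b = (1.0 - alpha) * S * N_j / T
    return dict(zip(classes.tolist(), b.tolist()))

# Two classes of 4 samples each; class 1 is three times as "difficult".
budgets = allocate_budgets([1.0] * 4 + [3.0] * 4, [0] * 4 + [1] * 4, alpha=0.5)
print(budgets)  # {0: 1.0, 1: 3.0}
```

The sketch leaves out rounding to integers and capping $b_j$ at $N_j$, both of which a real pipeline would need; the capping case corresponds to the saturation on hard classes mentioned above.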

Abstract

Coreset selection aims to identify a small yet highly informative subset of data, thereby enabling more efficient model training while reducing storage overhead. Recently, this capability has been leveraged to tackle the challenges of fine-tuning large foundation models, offering a direct pathway to their efficient and practical deployment. However, most existing methods are class-agnostic, causing them to overlook significant difficulty variations among classes. This leads them to disproportionately prune samples from either overly easy or overly hard classes, resulting in a suboptimal allocation of the data budget that ultimately degrades the final coreset performance. To address this limitation, we propose Non-Uniform Class-Wise Coreset Selection (NUCS), a novel framework that integrates both class-level and sample-level difficulty. We propose a robust metric for global class difficulty, quantified as the winsorized average of per-sample difficulty scores. Guided by this metric, our method performs a theoretically grounded, non-uniform allocation of data selection budgets across classes, while adaptively selecting samples within each class from optimal difficulty ranges. Extensive experiments on a wide range of visual classification tasks demonstrate that NUCS consistently outperforms state-of-the-art methods across 10 diverse datasets and pre-trained models, achieving both superior accuracy and computational efficiency. These results highlight the promise of non-uniform class-wise selection strategies for advancing the efficient fine-tuning of large foundation models.
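The intra-class step can be pictured as choosing a contiguous window over the samples of one class sorted by difficulty. The sketch below is only an illustration under stated assumptions: the helper name `select_window` is hypothetical, the window endpoint is treated as an exclusive rank in easiest-first order, and the endpoint is supplied by the caller rather than predicted (the ridge-regression predictor is not reproduced here).

```python
import numpy as np

def select_window(scores, budget, end):
    """Return indices of `budget` samples whose difficulty ranks form a
    contiguous window ending at rank `end` (exclusive), easiest first.
    `end` is assumed to be given; NUCS predicts the endpoint instead."""
    order = np.argsort(scores, kind="stable")  # ascending difficulty
    n = len(order)
    budget = min(int(budget), n)               # cannot keep more than the class has
    end = int(np.clip(end, budget, n))         # keep the window inside the class
    return order[end - budget:end]

# Five samples of one class; keep 2 of them from the middle of the range.
class_scores = [0.9, 0.1, 0.5, 0.3, 0.7]
print(select_window(class_scores, budget=2, end=4))  # [2 4]
```

Moving `end` toward the number of samples in the class shifts the window toward harder samples, and moving it toward `budget` shifts it toward easier ones.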

Paper Structure

This paper contains 25 sections, 1 theorem, 15 equations, 6 figures, 8 tables, 1 algorithm.

Key Result

Proposition 1

A non-uniform allocation strategy that assigns a larger portion of the budget to classes with higher global difficulty $\mathbf{S}_j$ yields a more effective coreset.
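As a quick consistency check on the allocation rule from the TL;DR, $b_j = (1-\alpha)\mathbf{S}_j N_j / T$ with $T = (\sum_i \mathbf{S}_i N_i)/N$, the short derivation below shows that the class budgets sum to $(1-\alpha)N$ and that the per-class keep ratio is proportional to $\mathbf{S}_j$. Reading $\alpha$ as the overall pruning rate is an assumption suggested by this sum, not a definition quoted from the paper.

```latex
\sum_j b_j
  = \frac{1-\alpha}{T} \sum_j \mathbf{S}_j N_j
  = (1-\alpha)\, \frac{N \sum_j \mathbf{S}_j N_j}{\sum_i \mathbf{S}_i N_i}
  = (1-\alpha)\, N,
\qquad
\frac{b_j}{N_j} = (1-\alpha)\, \frac{\mathbf{S}_j}{T}.
```

A class whose difficulty equals the size-weighted mean $T$ is kept at exactly the uniform rate $1-\alpha$. Harder classes are kept at a higher rate and easier ones at a lower rate.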

Figures (6)

  • Figure 1: A comparison of our method with a conventional class-agnostic method. (a) Class-agnostic selection treats all samples as a single pool. (b) Our proposed NUCS performs a non-uniform, class-wise selection by choosing an appropriate number of samples within a suitable difficulty range for each class.
  • Figure 2: Comparison of EL2N score distributions between the entire dataset (gray) and two individual classes (colored) on (a) Food101 and (b) CIFAR100-LT. For both datasets, the distributions of individual classes show significant shifts compared to the global data distribution, highlighting the heterogeneity of class difficulty. The model is a ResNet18 fine-tuned from ImageNet-1K.
  • Figure 3: Overview of NUCS. During model fine-tuning, NUCS (a) allocates non-uniform selection budgets based on global class difficulty and (b) automatically selects samples from an appropriate difficulty range in each class according to the allocated budget. The workflow is illustrated here using three representative classes from the Food101 dataset, in the context of fine-tuning a ResNet18 model pre-trained on ImageNet-1K.
  • Figure 4: Performance comparison between our methods and other baselines. Experimental results demonstrate consistent and significant improvements across various datasets and pre-trained models. Because pretrained ViT models are more robust to data pruning, we evaluate them under a more challenging set of higher pruning rates.
  • Figure 5: Ablation study comparing NUCS-O with its uniform budget allocation variant NUCS-O (uniform) on Food101 (a-b) and CIFAR100-LT (c-d) with ResNet18/ViT-L backbones.
  • ...and 1 more figure

Theorems & Definitions (1)

  • Proposition 1