Compressing Model with Few Class-Imbalance Samples: An Out-of-Distribution Expedition
Tian-Shuang Wu, Shen-Huan Lyu, Ning Chen, Zhihao Qu, Baoliu Ye
TL;DR
The paper addresses class imbalance in few-sample model compression, an issue that prior work has largely overlooked, by introducing OE-FSMC, a framework that uses out-of-distribution (OOD) data to rebalance training during both compression and fine-tuning. It combines complementary OOD label assignment, class-aware pruning, a joint distillation loss, and a regularization term that keeps the model from overfitting to OOD samples, with early stopping to preserve generalization. Experiments on CIFAR-10/100 and ILSVRC-2012 show that OE-FSMC improves accuracy across multiple base methods and architectures, especially when data are severely limited. The work offers a practical strategy for mitigating minority-class degradation in few-sample compression and suggests that OOD-assisted rebalancing could extend to other compression paradigms, such as quantization.
Abstract
Few-sample model compression has been widely adopted in recent years as a compromise between privacy and performance when privacy and security concerns limit the available data. However, when the number of available samples is extremely limited, class imbalance becomes a common and difficult problem. Collecting an equal number of samples for every class is often costly and impractical in real-world applications, and previous studies on few-sample model compression have mostly ignored this issue. Our experiments demonstrate that class imbalance degrades the overall performance of few-sample model compression methods. To address this problem, we propose a novel, adaptive framework named OOD-Enhanced Few-Sample Model Compression (OE-FSMC). The framework integrates easily accessible out-of-distribution (OOD) data into both the compression and fine-tuning processes, rebalancing the training distribution. We also incorporate a joint distillation loss and a regularization term to reduce the risk of the model overfitting to the OOD data. Extensive experiments on multiple benchmark datasets show that our framework can be incorporated into existing few-sample model compression methods and mitigates the accuracy degradation caused by class imbalance.
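
To make the rebalancing idea concrete, here is a minimal Python sketch of one way OOD samples could be given labels that complement an imbalanced few-sample set. The text above does not specify the exact assignment rule used by OE-FSMC, so this sketch assumes the simplest one: each class receives enough OOD samples to match the size of the largest class. The function names `complementary_ood_quota` and `assign_ood_labels` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def complementary_ood_quota(labels, num_classes):
    """Number of OOD samples each class needs to reach the largest class size.

    Assumption for illustration only: the target is simple equalization,
    which may differ from the rule actually used by OE-FSMC.
    """
    counts = Counter(labels)
    target = max(counts.values()) if counts else 0
    return [target - counts.get(c, 0) for c in range(num_classes)]


def assign_ood_labels(num_ood, quota):
    """Label OOD samples by filling class quotas in class order.

    If there are more OOD samples than the quotas require, the extras are
    left unlabeled (None). If there are fewer, later classes are filled last.
    """
    assigned = []
    for cls, need in enumerate(quota):
        assigned.extend([cls] * need)
    assigned = assigned[:num_ood]
    assigned.extend([None] * (num_ood - len(assigned)))
    return assigned


# Toy in-distribution set: class 0 has 4 samples, class 1 has 2, class 2 has 1.
labels = [0, 0, 0, 0, 1, 1, 2]
quota = complementary_ood_quota(labels, num_classes=3)
ood_labels = assign_ood_labels(num_ood=6, quota=quota)
print(quota)       # per-class number of OOD samples to add
print(ood_labels)  # label given to each of the 6 OOD samples
```

In this toy setting the minority classes absorb the OOD samples while the majority class receives none, which is the sense in which the assigned labels complement the original distribution. A full pipeline would then train on the combined set with the distillation loss and regularization term described above, which this sketch does not attempt to reproduce.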
