Adaptive Group Robust Ensemble Knowledge Distillation
Patrik Kenfack, Ulrich Aïvodji, Samira Ebrahimi Kahou
TL;DR
This paper addresses the problem that ensemble knowledge distillation can worsen worst-group performance when distilling from biased teachers. It introduces Adaptive Group Robust Ensemble Knowledge Distillation (AGRE-KD), which uses an additional biased model as a reference in gradient space and upweights teachers whose gradient directions diverge from it, thereby reducing the student's reliance on spurious correlations. Across synthetic and real-world benchmarks, AGRE-KD consistently improves worst-group accuracy, often outperforming standard deep ensembles and other KD baselines, and remains robust across varying numbers of debiased teachers and architectural heterogeneity. The work demonstrates a practical, gradient-based, unsupervised strategy to improve subgroup fairness in distillation, with potential impact on deploying compact yet robust models on edge devices.
Abstract
Neural networks can learn spurious correlations in the data, often leading to performance degradation for underrepresented subgroups. Studies have demonstrated that this disparity is amplified when knowledge is distilled from a complex teacher model to a relatively "simple" student model. Prior work has shown that ensemble deep learning methods can improve the performance of the worst-case subgroups; however, it is unclear whether this advantage carries over when distilling knowledge from an ensemble of teachers, especially when the teacher models are debiased. This study demonstrates that traditional ensemble knowledge distillation can significantly degrade the performance of the worst-case subgroups in the distilled student model, even when the teacher models are debiased. To overcome this, we propose Adaptive Group Robust Ensemble Knowledge Distillation (AGRE-KD), a simple ensembling strategy that ensures the student model receives knowledge beneficial for unknown underrepresented subgroups. Leveraging an additional biased model, our method favors teachers whose knowledge would better improve the worst-performing subgroups by upweighting the teachers whose gradient directions deviate from that of the biased model. Our experiments on several datasets demonstrate the superiority of the proposed ensemble distillation technique and show that it can even outperform classic model ensembles based on majority voting. Our source code is available at https://github.com/patrikken/AGRE-KD
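The upweighting idea described in the abstract can be illustrated with a small sketch. This is not the released implementation. It assumes that each teacher's distillation gradient and the biased model's gradient are available as flat vectors, and that a teacher's weight is one minus its cosine similarity to the biased gradient, normalized over the ensemble. The abstract does not state this weighting rule or the normalization, and the function names `teacher_weights` and `combine_gradients` are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two flat gradient vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def teacher_weights(teacher_grads, biased_grad):
    """Assumed rule: weight_t = 1 - cos(g_t, g_biased), normalized to sum to 1.

    Teachers aligned with the biased model get a weight near 0;
    teachers pointing away from it get the largest weights.
    """
    raw = np.array([1.0 - cosine_similarity(g, biased_grad) for g in teacher_grads])
    raw = np.clip(raw, 0.0, None)  # guard against tiny negative rounding errors
    total = raw.sum()
    if total <= 1e-12:
        # Every teacher agrees with the biased model: fall back to uniform weights.
        return np.full(len(teacher_grads), 1.0 / len(teacher_grads))
    return raw / total


def combine_gradients(teacher_grads, biased_grad):
    """Weighted sum of the per-teacher gradients used to update the student."""
    weights = teacher_weights(teacher_grads, biased_grad)
    return weights, weights @ np.stack(teacher_grads)


# Toy example with 2-D gradients (illustrative values only).
biased = np.array([1.0, 0.0])
teachers = [
    np.array([1.0, 0.0]),   # same direction as the biased model
    np.array([0.0, 1.0]),   # orthogonal to the biased model
    np.array([-1.0, 0.0]),  # opposite to the biased model
]
weights, update = combine_gradients(teachers, biased)
```

In this toy case the teacher aligned with the biased model receives a weight of approximately zero, and the orthogonal and opposing teachers receive approximately one third and two thirds. In an actual training loop the gradient vectors would come from backpropagating each teacher's distillation loss through the student, which this sketch leaves out.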
