The Benefits and Risks of Transductive Approaches for AI Fairness
Muhammed Razzak, Andreas Kirsch, Yarin Gal
TL;DR
Transductive learning uses a holdout set to guide training, but the holdout's sensitive-group composition can bias fairness outcomes. The authors study two transductive methods, RHO-Loss and FairGen, on CIFAR100-20 and CelebA to assess how holdout balance affects discriminative accuracy, fairness metrics, and generative quality. They find that imbalanced holdouts exacerbate disparities, while balanced holdouts can mitigate biases in both discriminative and generative tasks. The results point to the need for diverse, representative holdout sets and motivate strategies for constructing or adapting holdouts to match deployment distributions.
Abstract
Recently, transductive learning methods, which leverage holdout sets during training, have gained popularity for their potential to improve speed, accuracy, and fairness in machine learning models. Despite this, the composition of the holdout set itself, particularly the balance of sensitive sub-groups, has been largely overlooked. Our experiments on CIFAR and CelebA datasets show that compositional changes in the holdout set can substantially influence fairness metrics. Imbalanced holdout sets exacerbate existing disparities, while balanced holdouts can mitigate issues introduced by imbalanced training data. These findings underline the necessity of constructing holdout sets that are both diverse and representative.
