Toward Fairness via Maximum Mean Discrepancy Regularization on Logits Space
Hao-Wei Chung, Ching-Hao Chiu, Yu-Jen Chen, Yiyu Shi, Tsung-Yi Ho
TL;DR
This work addresses fairness in high-risk computer-vision tasks by targeting Equalized Odds (EO) with a logits-space regularizer. The authors introduce Logits-MMD, which minimizes the Maximum Mean Discrepancy (MMD) between the logit distributions of different sensitive groups within each class and adds this term to the standard cross-entropy loss: $\min_{\Theta} L_{CE}(\Theta) + \lambda L_{MMD}(\Theta)$. They argue that prior logits-space methods (Gaussian Assumption and Histogram Approximation) impose distributional priors that misalign with EO, whereas MMD aligns the distributions in a reproducing kernel Hilbert space (RKHS) with a Gaussian kernel, without thresholds or a parametric form for the logits. Empirically, Logits-MMD achieves state-of-the-art Equalized Odds results on CelebA and UTK Face, and it also generalizes to a biased setting on the Dogs and Cats dataset. The authors present the approach as a principled and scalable route to fair predictions that avoids restrictive distributional assumptions, with potential applicability to multi-attribute fairness in vision systems.
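For reference, the quantities named above can be written out as follows. The Equalized Odds condition and the kernel form of squared MMD are standard definitions. The per-class sum used for $L_{MMD}$, the binary sensitive attribute $A \in \{0, 1\}$, and the symbols $P_{y,a}$, $z$, and $\sigma$ are assumptions made here for illustration; the paper's exact formulation may differ.

$$
P(\hat{Y} = \hat{y} \mid A = 0, Y = y) = P(\hat{Y} = \hat{y} \mid A = 1, Y = y) \quad \text{for all } y, \hat{y}
$$

$$
\mathrm{MMD}^2(P, Q) = \mathbb{E}_{z, z' \sim P}[k(z, z')] - 2\,\mathbb{E}_{z \sim P,\, \tilde{z} \sim Q}[k(z, \tilde{z})] + \mathbb{E}_{\tilde{z}, \tilde{z}' \sim Q}[k(\tilde{z}, \tilde{z}')], \qquad k(u, v) = \exp\!\left(-\frac{\lVert u - v \rVert^2}{2\sigma^2}\right)
$$

$$
L_{MMD}(\Theta) = \sum_{y} \mathrm{MMD}^2\left(P_{y,0},\, P_{y,1}\right)
$$

Here $P_{y,a}$ denotes the distribution of the logits $z$ produced by the model with parameters $\Theta$ for samples with label $Y = y$ and sensitive attribute $A = a$. Because the Gaussian kernel is characteristic, $\mathrm{MMD}^2(P_{y,0}, P_{y,1}) = 0$ only when the two logit distributions coincide, which in turn implies that any prediction derived from the logits satisfies the Equalized Odds condition for that class.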
Abstract
Fairness has become increasingly pivotal in machine learning for high-risk applications such as healthcare and facial recognition. However, previous methods that constrain the logits space have deficiencies. We therefore propose a novel framework, Logits-MMD, that achieves the fairness condition by imposing constraints on the output logits with Maximum Mean Discrepancy. Quantitative analysis and experimental results show that our framework has better properties than previous methods, outperforms them, and achieves state-of-the-art results on two facial recognition datasets and one animal dataset. Finally, we present experimental results demonstrating that our debiasing approach achieves the fairness condition effectively.
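To make the idea of an MMD constraint on output logits concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a binary sensitive attribute, a fixed Gaussian kernel bandwidth `sigma`, the biased (V-statistic) estimator of squared MMD, and a plain sum over classes. The function names `gaussian_kernel`, `mmd2`, and `logits_mmd_penalty` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel matrix between the rows of x (n, d) and y (m, d)."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * sigma**2))


def mmd2(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    return (
        gaussian_kernel(x, x, sigma).mean()
        + gaussian_kernel(y, y, sigma).mean()
        - 2.0 * gaussian_kernel(x, y, sigma).mean()
    )


def logits_mmd_penalty(logits, labels, groups, sigma=1.0):
    """Sum over classes of squared MMD between the logits of group 0 and group 1.

    Assumption: `groups` holds a binary sensitive attribute (0 or 1).
    Classes that lack samples from either group are skipped.
    """
    total = 0.0
    for c in np.unique(labels):
        z0 = logits[(labels == c) & (groups == 0)]
        z1 = logits[(labels == c) & (groups == 1)]
        if len(z0) == 0 or len(z1) == 0:
            continue
        total += mmd2(z0, z1, sigma)
    return float(total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(200, 2))
    labels = rng.integers(0, 2, size=200)
    groups = rng.integers(0, 2, size=200)

    # Logits that do not depend on the group give a small penalty.
    print("unbiased logits:", logits_mmd_penalty(logits, labels, groups))

    # Shifting one group's logits makes the two distributions differ.
    shifted = logits.copy()
    shifted[groups == 1] += 3.0
    print("shifted logits: ", logits_mmd_penalty(shifted, labels, groups))
```

In training, a penalty of this kind would be computed on each mini-batch in an automatic-differentiation framework and added to the cross-entropy loss with a weight $\lambda$; NumPy is used here only to keep the sketch short and dependency-light.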
