FairContrast: Enhancing Fairness through Contrastive learning and Customized Augmenting Methods on Tabular Data
Aida Tayebi, Ali Khodabandeh Yalabadi, Mehdi Yazdani-Jahromi, Ozlem Ozmen Garibay
TL;DR
The paper tackles bias in tabular-data models by introducing FairContrast, a fairness-aware contrastive-learning framework that employs a specialized positive-pair sampling strategy and a hybrid loss combining supervised/self-supervised contrastive objectives with binary cross-entropy. The theoretical analysis shows that the approach implicitly balances label-relevant information against leakage of sensitive attributes, enabling a data-driven fairness-utility trade-off without extra adversaries or estimators. Empirically, FairContrast delivers reduced bias (a lower Demographic Parity gap) with minimal accuracy loss across three datasets (Adult, German, Heritage Health) in both supervised and unsupervised modes, outperforming several state-of-the-art tabular fairness baselines. The work highlights the potential of contrastive learning for fair representations in tabular domains and points to avenues for extending to other fairness notions and data modalities.
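To make the idea of a hybrid loss concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a positive pair is two samples with the same label but different sensitive-group membership, that the contrastive term follows the common supervised-contrastive form with cosine similarity, and that the two terms are mixed by a hypothetical weight `alpha`. None of these details are stated in this summary, and all function names are illustrative.

```python
import math


def _cosine(u, v, eps=1e-12):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def contrastive_loss(z, labels, groups, temperature=0.5):
    """Supervised-contrastive-style loss over a batch of embeddings.

    ASSUMPTION: positives for an anchor are samples with the same label
    but a different sensitive group. Anchors without positives are skipped.
    """
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [
            j for j in range(n)
            if j != i and labels[j] == labels[i] and groups[j] != groups[i]
        ]
        if not positives:
            continue
        # Temperature-scaled similarities to every other sample in the batch.
        sims = {j: _cosine(z[i], z[j]) / temperature for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        # Average negative log-probability of the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def bce(probs, labels, eps=1e-12):
    """Mean binary cross-entropy for predicted probabilities."""
    return -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for p, y in zip(probs, labels)
    ) / len(labels)


def hybrid_loss(z, probs, labels, groups, alpha=0.5, temperature=0.5):
    """ASSUMPTION: convex combination of the two terms, weighted by alpha."""
    return alpha * contrastive_loss(z, labels, groups, temperature) + (
        1 - alpha
    ) * bce(probs, labels)


# Toy batch: two labels, two sensitive groups.
labels = [1, 1, 0, 0]
groups = [0, 1, 0, 1]
probs = [0.9, 0.8, 0.2, 0.1]
z_by_label = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # clusters by label
z_by_group = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # clusters by group

print(contrastive_loss(z_by_label, labels, groups))
print(contrastive_loss(z_by_group, labels, groups))
print(hybrid_loss(z_by_label, probs, labels, groups))
```

Under these assumptions, embeddings that cluster by label across sensitive groups receive a lower contrastive loss than embeddings that cluster by group, which is the behavior a fairness-aware positive-pair strategy is meant to reward.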
Abstract
As AI systems become more embedded in everyday life, the development of fair and unbiased models becomes more critical. Considering the social impact of AI systems is not merely a technical challenge but a moral imperative. As evidenced in numerous research studies, learning fair and robust representations has proven to be a powerful approach to debiasing algorithms and improving fairness while preserving the information essential for prediction tasks. Representation learning frameworks, particularly those that use self-supervised and contrastive learning, have demonstrated superior robustness and generalizability across various domains. Despite the growing interest in applying these approaches to tabular data, the fairness of the learned representations remains underexplored. In this study, we introduce a contrastive learning framework specifically designed to address bias and learn fair representations in tabular datasets. By strategically selecting positive pair samples and employing supervised and self-supervised contrastive learning, we significantly reduce bias compared to existing state-of-the-art contrastive learning models for tabular data. Our results demonstrate the efficacy of our approach in mitigating bias with minimal trade-off in accuracy and in leveraging the learned fair representations in various downstream tasks.
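Bias in this setting is commonly quantified with Demographic Parity, the metric named in the TL;DR. The sketch below computes the Demographic Parity gap as the absolute difference in positive-prediction rates between two groups. It assumes a binary sensitive attribute and binary predictions. The function name and the toy data are illustrative and are not taken from the paper.

```python
def demographic_parity_gap(y_pred, sensitive):
    """Absolute difference in positive-prediction rates between two groups.

    y_pred: iterable of 0/1 predictions.
    sensitive: iterable of 0/1 group membership (binary attribute assumed).
    """
    rates = {}
    for group in (0, 1):
        preds = [p for p, s in zip(y_pred, sensitive) if s == group]
        if not preds:
            raise ValueError(f"no samples for group {group}")
        rates[group] = sum(preds) / len(preds)
    return abs(rates[0] - rates[1])


# Toy example: group 0 receives positive predictions 3 times out of 4,
# group 1 only once out of 4.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]
gap = demographic_parity_gap(y_pred, sensitive)
print(gap)  # 0.5
```

A gap of 0 means both groups receive positive predictions at the same rate; larger values indicate a stronger dependence of the prediction on the sensitive attribute.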
