Data Acquisition for Improving Model Fairness using Reinforcement Learning
Jahid Hasan, Romila Pradhan
TL;DR
This work addresses the problem of improving classifier fairness under a data budget. It proposes DataSift, a framework that treats data acquisition as a multi-armed bandit problem over partitions of the data pool and uses an upper confidence bound (UCB) strategy to choose the partition from which each batch is drawn. To accelerate gains, DataSift-Inf integrates data valuation via influence functions to construct high-impact batches within the chosen partition, enabling faster improvements in the fairness metric $\mathcal{F}$ while maintaining accuracy. Empirical results across six real-world and synthetic datasets show that DataSift commonly outperforms baselines (Random, Entropy, AutoData, Inf) in achieving fairness with fewer acquired points, and that DataSift-Inf provides further efficiency and robustness, including model-agnostic applicability. The paper also analyzes hyperparameter settings, partitioning strategies, and scalability, and outlines future work on non-parametric models, alternative RL methods, and cost-aware acquisition.
Abstract
Machine learning systems are increasingly being used for critical decision making in domains such as healthcare, finance, and criminal justice. Concerns around their fairness have resulted in several bias mitigation techniques that emphasize the need for high-quality data to ensure fairer decisions. However, the role of earlier stages of machine learning pipelines in mitigating model bias has not been well explored. In this paper, we focus on the task of acquiring additional labeled data points for training the downstream machine learning model to rapidly improve its fairness. Since not all data points in a data pool are equally beneficial to the task of fairness, we generate an ordering in which data points should be acquired. We present DataSift, a data acquisition framework based on the idea of data valuation that relies on partitioning and multi-armed bandits to determine the most valuable data points to acquire. Over several iterations, DataSift selects a partition, randomly samples a batch of data points from the selected partition, evaluates the benefit of acquiring the batch on model fairness, and updates the utility of the partition according to that benefit. To further improve the effectiveness and efficiency of evaluating batches, we leverage influence functions that estimate the effect of acquiring a batch without retraining the model. We empirically evaluate DataSift on several real-world and synthetic datasets and show that the fairness of a machine learning model can be significantly improved even when only a few data points are acquired.
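The iterative loop described in the abstract (select a partition, sample a batch, evaluate its benefit, update the partition's utility) can be illustrated with a minimal UCB1-style sketch. This is not the authors' implementation. The function name `ucb_acquire`, the callback `evaluate_gain`, the exploration constant `c`, and the choice to keep every sampled batch are all assumptions made for illustration. In practice `evaluate_gain` would measure the change in the fairness metric after retraining, or estimate it with influence functions.

```python
import math
import random


def ucb_acquire(partitions, evaluate_gain, budget, batch_size,
                c=math.sqrt(2), seed=0):
    """Hypothetical sketch of bandit-driven data acquisition over partitions.

    partitions    : list of lists of candidate data points (one list per arm)
    evaluate_gain : callable(batch) -> float, assumed to return the fairness
                    improvement attributed to acquiring `batch`
    budget        : maximum number of points to acquire
    batch_size    : number of points drawn per round
    """
    rng = random.Random(seed)
    # Shuffle each pool once; popping from the end is then a uniform
    # random sample without replacement.
    pools = [list(p) for p in partitions]
    for pool in pools:
        rng.shuffle(pool)

    counts = [0] * len(pools)   # times each partition was selected
    means = [0.0] * len(pools)  # running mean gain (utility) per partition
    acquired = []
    t = 0

    while len(acquired) < budget and any(pools):
        t += 1
        live = [k for k, pool in enumerate(pools) if pool]
        untried = [k for k in live if counts[k] == 0]
        if untried:
            # Try every non-empty partition once before using the UCB rule.
            k = untried[0]
        else:
            # UCB1 score: mean utility plus an exploration bonus.
            k = max(live, key=lambda j: means[j]
                    + c * math.sqrt(math.log(t) / counts[j]))

        size = min(batch_size, len(pools[k]), budget - len(acquired))
        batch = [pools[k].pop() for _ in range(size)]

        gain = evaluate_gain(batch)
        counts[k] += 1
        means[k] += (gain - means[k]) / counts[k]  # incremental mean update
        acquired.extend(batch)  # assumption: every sampled batch is kept

    return acquired, counts, means
```

As a toy usage example, each point can be represented by a made-up scalar standing in for its fairness benefit, with `evaluate_gain` returning the batch mean. Under that setup the partition with the highest values should be selected most often, while the exploration bonus still sends occasional rounds to the other partitions.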
