NeuFair: Neural Network Fairness Repair with Dropout
Vishnu Asutosh Dasu, Ashish Kumar, Saeid Tizpaz-Niari, Gang Tan
TL;DR
NeuFair tackles unfairness in pre-trained DNNs by applying inference-time neuron dropout as a post-processing bias mitigation. It formulates the choice of neurons to drop as a combinatorial optimization problem and solves it with two randomized search strategies, Simulated Annealing (SA) and Random Walk (RW), guided by a cost function that couples Equalized Odds Difference ($EOD$) with an $F1$-based penalty to preserve utility. Empirical results across seven benchmarks and five socially critical tasks show up to a 69% reduction in $EOD$ with minimal or acceptable loss in $F1$, and NeuFair generally outperforms Dice, a state-of-the-art post-processing method. The work contributes a practical, open-source framework for post hoc fairness repair that operates without modifying the training data or retraining, enabling deployment-time fairness improvements with controllable utility trade-offs. It also provides insights into hyperparameter effects and the relative strengths of SA versus RW in navigating the neuron-dropout search space.
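The search described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one common definition of $EOD$ (the larger of the absolute true-positive-rate and false-positive-rate gaps between two groups), a hypothetical cost of the form $EOD$ plus a linear penalty when $F1$ falls below a floor, and a generic simulated-annealing loop that flips one neuron per step. The names `eod`, `cost`, `anneal`, `f1_floor`, `penalty`, `t0`, and `cooling` are all illustrative.

```python
import math
import random

def confusion_rates(y_true, y_pred):
    """Return (TPR, FPR, F1) for binary labels; empty denominators give 0."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return tpr, fpr, f1

def eod(y_true, y_pred, group):
    """Assumed EOD: max of the TPR gap and FPR gap between groups 0 and 1."""
    per_group = []
    for g in (0, 1):
        yt = [t for t, s in zip(y_true, group) if s == g]
        yp = [p for p, s in zip(y_pred, group) if s == g]
        per_group.append(confusion_rates(yt, yp))
    (tpr0, fpr0, _), (tpr1, fpr1, _) = per_group
    return max(abs(tpr0 - tpr1), abs(fpr0 - fpr1))

def cost(y_true, y_pred, group, f1_floor, penalty=1.0):
    """Hypothetical cost: EOD plus a penalty for F1 below f1_floor."""
    _, _, f1 = confusion_rates(y_true, y_pred)
    return eod(y_true, y_pred, group) + penalty * max(0.0, f1_floor - f1)

def anneal(n_neurons, cost_of_mask, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Generic simulated annealing over binary keep/drop masks (1 = keep)."""
    rng = random.Random(seed)
    mask = [1] * n_neurons              # start with no neuron dropped
    current = cost_of_mask(mask)
    best_mask, best = mask[:], current
    temp = t0
    for _ in range(steps):
        candidate = mask[:]
        i = rng.randrange(n_neurons)
        candidate[i] = 1 - candidate[i]  # flip one neuron's state
        c = cost_of_mask(candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if c <= current or rng.random() < math.exp((current - c) / temp):
            mask, current = candidate, c
            if current < best:
                best_mask, best = mask[:], current
        temp *= cooling                  # geometric cooling schedule
    return best_mask, best

# Toy data: group 0 is classified perfectly, group 1 is not.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
toy_eod = eod(y_true, y_pred, group)
toy_cost = cost(y_true, y_pred, group, f1_floor=0.8)

# Stand-in cost for the search: distance to an arbitrary target mask.
target = [1, 0, 1, 1, 0, 1]
def toy_mask_cost(mask):
    return sum(m != t for m, t in zip(mask, target)) / len(target)

best_mask, best_cost = anneal(len(target), toy_mask_cost)
```

In a real setting, `cost_of_mask` would run the pre-trained network with the candidate mask on a validation set and score the resulting predictions with `cost`. A random-walk variant would replace the acceptance test with unconditional acceptance while still tracking the best mask seen.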
Abstract
This paper investigates neuron dropout as a post-processing bias mitigation for deep neural networks (DNNs). Neural-driven software solutions are increasingly applied in socially critical domains with significant fairness implications. While neural networks are exceptionally good at finding statistical patterns in data, they may encode and amplify existing biases from historical data. Existing bias mitigation algorithms often require modifying the input dataset or the learning algorithm. We posit that the prevalent dropout methods, which prevent over-fitting during training by randomly dropping neurons, may be an effective and less intrusive approach to improving the fairness of pre-trained DNNs. However, finding the ideal set of neurons to drop is a combinatorial problem. We propose NeuFair, a family of post-processing randomized algorithms that mitigate unfairness in pre-trained DNNs by dropping neurons at inference time. Our randomized search is guided by an objective that minimizes discrimination while maintaining the model's utility. We show that our randomized algorithms are effective and efficient in improving fairness (by up to 69%) with minimal or no degradation in model performance. We provide intuitive explanations of these phenomena and carefully examine the influence of various hyperparameters of the search algorithms on the results. Finally, we empirically and conceptually compare NeuFair to different state-of-the-art bias mitigators.
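To make the idea of dropout at inference time concrete, the following is a small sketch of a forward pass through a one-hidden-layer network in which a fixed binary mask zeroes selected hidden neurons. The network, its weights, and the function name `forward` are hypothetical, and the sketch assumes that dropped activations are simply set to zero with no rescaling of the remaining ones; the paper's actual architecture and masking details may differ.

```python
import math

def forward(x, W1, b1, w2, b2, mask):
    """Forward pass with a fixed keep/drop mask on hidden neurons (1 = keep)."""
    hidden = []
    for row, bias, keep in zip(W1, b1, mask):
        pre = sum(w * xi for w, xi in zip(row, x)) + bias
        hidden.append(max(0.0, pre) * keep)   # ReLU, then zero if dropped
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))     # sigmoid output probability

# Hypothetical weights for a 2-input, 3-hidden-neuron, 1-output network.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [0.5, -0.5, 1.0]
b2 = 0.0
x = [1.0, 2.0]

full = forward(x, W1, b1, w2, b2, [1, 1, 1])     # no neuron dropped
dropped = forward(x, W1, b1, w2, b2, [1, 1, 0])  # third hidden neuron dropped
```

With these toy weights, dropping the third hidden neuron moves the output from above 0.5 to below it, so a single dropped neuron can flip a prediction. The weights themselves are untouched, which is why this kind of repair needs no retraining.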
