Fair MP-BOOST: Fair and Interpretable Minipatch Boosting
Camille Olivia Little, Genevera I. Allen
TL;DR
Fair MP-Boost addresses the need for boosting methods for tabular data that are simultaneously accurate, fair, and interpretable. It introduces a double stochastic gradient boosting framework that adaptively learns the sampling distributions over observations and features from which minipatches are drawn. A tradeoff parameter $\alpha$ balances the accuracy loss $\mathcal{L}_A$ against the fairness loss $\mathcal{L}_F$. The method is intrinsically interpretable: the averaged sampling probabilities, together with feature importance metrics such as TreeFIS and FairTreeFIS, indicate which features and observations drive the model. Out-of-patch validation is used for early stopping and tuning. Empirical results on simulated data and real benchmarks (Adult and Law School) show that Fair MP-Boost can outperform competing bias-mitigation approaches in fairness while maintaining competitive accuracy, offering a practical, interpretable alternative for fair boosting in high-stakes domains.
Abstract
Ensemble methods, particularly boosting, are among the most effective and widely used machine learning techniques for tabular data. In this paper, we aim to retain the strong predictive power of traditional boosting methods while enhancing fairness and interpretability. To achieve this, we develop Fair MP-Boost, a stochastic boosting scheme that balances fairness and accuracy by adaptively learning which features and observations to sample during training. Specifically, Fair MP-Boost sequentially samples small subsets of observations and features, termed minipatches (MP), according to adaptively learned feature and observation sampling probabilities. We devise these probabilities by combining loss functions or feature importance scores so that they address accuracy and fairness simultaneously. Hence, Fair MP-Boost prioritizes important and fair features along with challenging instances to select the most relevant minipatches for learning. The learned probability distributions also yield intrinsic interpretations of feature importance and important observations in Fair MP-Boost. Through empirical evaluation on simulated and benchmark datasets, we showcase the interpretability, accuracy, and fairness of Fair MP-Boost.
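
The minipatch sampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The convex combination $(1-\alpha)\cdot\text{accuracy} + \alpha\cdot\text{fairness}$, the simple normalization into probabilities, the choice that larger scores mean a higher sampling probability, and all function names and toy numbers are assumptions made for illustration only.

```python
import random


def combine_scores(acc_scores, fair_scores, alpha):
    """Per-item convex combination: (1 - alpha) * accuracy + alpha * fairness.

    Assumed form of the tradeoff; alpha = 0 uses accuracy only,
    alpha = 1 uses fairness only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * a + alpha * f for a, f in zip(acc_scores, fair_scores)]


def to_probabilities(scores, eps=1e-12):
    """Normalize non-negative scores into a sampling distribution."""
    shifted = [max(s, 0.0) + eps for s in scores]  # eps keeps every item reachable
    total = sum(shifted)
    return [s / total for s in shifted]


def weighted_sample_without_replacement(probs, k, rng):
    """Draw k distinct indices, favoring high-probability items.

    Uses Efraimidis-Spirakis keys u ** (1 / p) and keeps the k largest.
    """
    keys = [(rng.random() ** (1.0 / p), i) for i, p in enumerate(probs)]
    keys.sort(reverse=True)
    return sorted(i for _, i in keys[:k])


def sample_minipatch(obs_probs, feat_probs, n_obs, n_feat, rng):
    """A minipatch is a small subset of rows paired with a small subset of columns."""
    rows = weighted_sample_without_replacement(obs_probs, n_obs, rng)
    cols = weighted_sample_without_replacement(feat_probs, n_feat, rng)
    return rows, cols


# Toy, made-up per-observation losses and per-feature scores (hypothetical values).
rng = random.Random(0)
obs_acc_loss = [0.1, 0.9, 0.4, 0.2, 0.7, 0.3]
obs_fair_loss = [0.5, 0.1, 0.6, 0.2, 0.1, 0.9]
feat_acc_score = [0.6, 0.1, 0.2, 0.1]
feat_fair_score = [0.1, 0.5, 0.3, 0.1]

obs_probs = to_probabilities(combine_scores(obs_acc_loss, obs_fair_loss, alpha=0.5))
feat_probs = to_probabilities(combine_scores(feat_acc_score, feat_fair_score, alpha=0.5))
rows, cols = sample_minipatch(obs_probs, feat_probs, n_obs=3, n_feat=2, rng=rng)
print("minipatch rows:", rows, "columns:", cols)
```

In a full boosting loop, a weak learner would be fit on the sampled rows and columns, and the scores would be recomputed before the next draw. Setting `alpha` closer to 1 shifts the sampling toward the fairness term, and closer to 0 toward the accuracy term.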
