OxEnsemble: Fair Ensembles for Low-Data Classification
Jonathan Rystrøm, Zihao Fu, Chris Russell
TL;DR
OxEnsemble tackles fair classification in data-scarce regimes by training an ensemble of deep networks, enforcing fairness constraints on each member using held-out data, and aggregating member predictions by majority vote. The authors prove guarantees for minimum-rate and error-parity fairness under restricted competence and give guidance on how much data is needed to observe these guarantees in practice. Empirically, OxEnsemble achieves stronger fairness-accuracy trade-offs than existing methods on three medical-imaging datasets (HAM10000, Fitzpatrick17k, FairVLMed), and its shared backbone keeps the added compute cost low. The result is a theoretically grounded approach to fairer decision-making in high-stakes, data-scarce domains. Code is available at the provided repository link.
Abstract
We address the problem of fair classification in settings where data is scarce and unbalanced across demographic groups. Such low-data regimes are common in domains like medical imaging, where false negatives can have fatal consequences. We propose a novel approach, OxEnsemble, for efficiently training ensembles and enforcing fairness in these low-data regimes. Unlike other approaches, we aggregate predictions across ensemble members, each trained to satisfy fairness constraints. By construction, OxEnsemble is both data-efficient, carefully reusing held-out data to enforce fairness reliably, and compute-efficient, requiring little more compute than is needed to fine-tune or evaluate an existing model. We validate this approach with new theoretical guarantees. Experimentally, our approach yields more consistent outcomes and stronger fairness-accuracy trade-offs than existing methods across multiple challenging medical imaging classification datasets.
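
To make the aggregation step concrete, the following is a minimal sketch of majority voting over binary ensemble predictions, together with one possible minimum-rate metric (the smallest per-group recall). It is not the authors' implementation: the function names `majority_vote` and `min_group_recall`, the choice of recall as the rate, the tie-breaking rule, and the toy labels are all assumptions made for illustration, and the numbers are not results from the paper.

```python
from collections import Counter


def majority_vote(member_preds):
    """Aggregate binary predictions by strict majority.

    member_preds[m][i] is the 0/1 label that ensemble member m assigns
    to sample i. Ties (possible with an even number of members) resolve
    to 0 here; this tie rule is an assumption of the sketch.
    """
    n_members = len(member_preds)
    n_samples = len(member_preds[0])
    votes = []
    for i in range(n_samples):
        positives = sum(member_preds[m][i] for m in range(n_members))
        votes.append(1 if 2 * positives > n_members else 0)
    return votes


def min_group_recall(y_true, y_pred, groups):
    """Return the smallest per-group recall (true positive rate).

    Only groups with at least one positive sample are considered.
    """
    true_pos = Counter()
    positives = Counter()
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            positives[g] += 1
            true_pos[g] += int(p == 1)
    return min(true_pos[g] / positives[g] for g in positives)


# Hypothetical toy data: three members, six samples, two groups.
member_preds = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1],
]
y_true = [1, 1, 1, 1, 0, 0]
groups = ["a", "a", "b", "b", "a", "b"]

ensemble_pred = majority_vote(member_preds)
member_min_recalls = [min_group_recall(y_true, p, groups) for p in member_preds]
ensemble_min_recall = min_group_recall(y_true, ensemble_pred, groups)

print("per-member min recall:", member_min_recalls)
print("ensemble min recall:", ensemble_min_recall)
```

In this toy case each member misses a different positive sample, so the vote recovers all of them. Whether voting helps or hurts in general depends on how members' errors are distributed, which is the kind of condition the paper's theoretical guarantees address.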
