Are demographically invariant models and representations in medical imaging fair?
Eike Petersen, Enzo Ferrante, Melanie Ganz, Aasa Feragen
TL;DR
This paper questions whether enforcing demographically invariant representations is desirable in medical imaging. It analyzes marginal invariance (which implies demographic parity), class-conditional invariance (which implies separation, i.e., equalized odds), and counterfactual invariance, highlighting their trade-offs, particularly when disease prevalence differs across groups. The authors argue that both marginal and class-conditional invariance can harm predictive performance and calibration and may not guarantee equal treatment, while counterfactual approaches face substantial definitional challenges in medical imaging. They conclude that encoding demographic attributes is not inherently unfair and can even be advantageous for learning task-relevant, physiology-based encodings, urging comprehensive subgroup fairness assessments and practical mitigation strategies rather than strict invariance.
Abstract
Medical imaging models have been shown to encode information about patient demographics such as age, race, and sex in their latent representation, raising concerns about their potential for discrimination. Here, we ask whether requiring models not to encode demographic attributes is desirable. We point out that marginal and class-conditional representation invariance imply the standard group fairness notions of demographic parity and equalized odds, respectively. In addition, however, they require matching the risk distributions, thus potentially equalizing away important group differences. Enforcing the traditional fairness notions directly instead does not entail these strong constraints. Moreover, representationally invariant models may still take demographic attributes into account for deriving predictions, implying unequal treatment; in fact, achieving representation invariance may require doing so. In theory, this can be prevented using counterfactual notions of (individual) fairness or invariance. We caution, however, that properly defining medical image counterfactuals with respect to demographic attributes is fraught with challenges. Finally, we posit that encoding demographic attributes may even be advantageous if it enables learning a task-specific encoding of demographic features that does not rely on social constructs such as 'race' and 'gender.' We conclude that demographically invariant representations are neither necessary nor sufficient for fairness in medical imaging. Models may need to encode demographic attributes, lending further urgency to calls for comprehensive model fairness assessments in terms of predictive performance across diverse patient groups.
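
The two group fairness notions named in the abstract can be made concrete with a small numerical sketch. The code below is illustrative only and is not taken from the paper: the helper names (`group_rates`, `fairness_gaps`) and the toy cohort are assumptions introduced here. It measures demographic parity as the gap in positive prediction rates between groups, and equalized odds as the gaps in true and false positive rates. The toy cohort has a different disease prevalence in each group, and the predictor is assumed to be perfect.

```python
import numpy as np


def group_rates(y_true, y_pred, group):
    """Per-group positive prediction rate, TPR and FPR (binary labels)."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = {}
    for g in np.unique(group):
        mask = group == g
        yt, yp = y_true[mask], y_pred[mask]
        rates[str(g)] = {
            "positive_rate": float(yp.mean()),   # P(Yhat = 1 | A = g)
            "tpr": float(yp[yt == 1].mean()),    # P(Yhat = 1 | Y = 1, A = g)
            "fpr": float(yp[yt == 0].mean()),    # P(Yhat = 1 | Y = 0, A = g)
        }
    return rates


def fairness_gaps(rates):
    """Largest between-group difference for each rate."""
    gaps = {}
    for key in ("positive_rate", "tpr", "fpr"):
        values = [r[key] for r in rates.values()]
        gaps[key] = max(values) - min(values)
    return gaps


# Hypothetical toy cohort: prevalence 0.5 in group "a", 0.2 in group "b".
y_true = np.array([1] * 5 + [0] * 5 + [1] * 2 + [0] * 8)
group = np.array(["a"] * 10 + ["b"] * 10)

# Assumed perfect predictor: every prediction equals the true label.
y_pred = y_true.copy()

rates = group_rates(y_true, y_pred, group)
gaps = fairness_gaps(rates)

dp_gap = gaps["positive_rate"]              # demographic parity gap
tpr_gap, fpr_gap = gaps["tpr"], gaps["fpr"]  # equalized odds gaps
print(rates)
print(gaps)
```

In this toy setting the perfect predictor satisfies equalized odds (both error-rate gaps are zero) but not demographic parity, because its positive rate in each group simply equals that group's prevalence. This is one simple way to see why enforcing parity-type constraints can cost predictive performance when prevalence differs across groups.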
