Perceptions of the Fairness Impacts of Multiplicity in Machine Learning
Anna P. Meyer, Yea-Seul Kim, Aws Albarghouthi, Loris D'Antoni
TL;DR
This paper investigates whether lay stakeholders perceive multiplicity in ML as a fairness risk and how they prefer to resolve it. Using a two-part survey with an educational component and a conjoint-style analysis, it finds that multiplicity lowers the perceived fairness of model outcomes but not the perceived appropriateness of using the model. Participants dislike both ignoring multiplicity (using a single model) and resolving it by randomization, and they show a clear preference for human-in-the-loop or more sophisticated resolution methods. Preferences vary with task stakes and framing, suggesting that practical ML deployment should tailor multiplicity-handling mechanisms to each context. The study highlights a gap between philosophical arguments for randomization and lay expectations, and it calls for greater transparency and stakeholder-aligned design in ML systems exhibiting multiplicity.
Abstract
Machine learning (ML) is increasingly used in high-stakes settings, yet multiplicity (the existence of multiple good models) means that some predictions are essentially arbitrary. ML researchers and philosophers posit that multiplicity poses a fairness risk, but no studies have investigated whether stakeholders agree. In this work, we conduct a survey to see how multiplicity impacts the perceptions of ML fairness held by lay stakeholders (i.e., decision subjects), and which approaches to addressing multiplicity they prefer. We also investigate how these perceptions are modulated by task characteristics (e.g., stakes and uncertainty). Survey respondents think that multiplicity threatens the fairness of model outcomes, but not the appropriateness of using the model, even though existing work suggests the opposite. Participants are strongly against resolving multiplicity by using a single model (effectively ignoring multiplicity) or by randomizing the outcomes. Our results indicate that model developers should be intentional about dealing with multiplicity in order to maintain fairness.
