On The Fairness Impacts of Hardware Selection in Machine Learning
Sree Harsha Nelaturu, Nishaanth Kanna Ravichandran, Cuong Tran, Sara Hooker, Ferdinando Fioretto
TL;DR
The paper examines how hardware selection affects fairness in ML, introducing two metrics, hardware sensitivity $\Delta(a,m)$ and fairness violation $\xi(D,m)$, to quantify disparities across demographic groups. It develops a theoretical framework in which hardware-induced unfairness is attributed to differences in group gradient flows and in the loss landscapes characterized by group Hessians, and validates these insights with experiments across GPUs, datasets, and architectures. As a practical mitigation, it augments the training loss with a term that aligns the distance to the decision boundary across groups, which substantially reduces fairness violations while preserving overall performance. The work shows that deployment hardware can alter model equity and offers guidelines for evaluating and mitigating these effects, urging cross-hardware reporting and robust training strategies for fair ML in heterogeneous hardware environments.
Abstract
In the machine learning ecosystem, hardware selection is often regarded as a mere utility, overshadowed by the spotlight on algorithms and data. This oversight is particularly problematic in contexts like ML-as-a-service platforms, where users often lack control over the hardware used for model deployment. How does the choice of hardware impact generalization properties? This paper investigates the influence of hardware on the delicate balance between model performance and fairness. We demonstrate that hardware choices can exacerbate existing disparities, attributing these discrepancies to variations in gradient flows and loss surfaces across different demographic groups. Through both theoretical and empirical analysis, the paper not only identifies the underlying factors but also proposes an effective strategy for mitigating hardware-induced performance imbalances.
