RISE: Interactive Visual Diagnosis of Fairness in Machine Learning Models
Ray Chen, Christan Grant
TL;DR
RISE addresses fairness assessment under domain shift, a setting in which scalar metrics do not reveal where disparities arise. It is an interactive visualization that sorts prediction residuals into a residual curve and links the curve's patterns to formal fairness notions through knee detection and three indicators: $\mathcal{F}_{\text{mean}}$, $\mathcal{F}_{\text{shift}}$, and $\mathcal{F}_{\text{acc}}$. This enables localized disparity diagnosis, cross-environment subgroup analysis, and exposure of accuracy–fairness trade-offs that aggregate metrics miss, supporting more informed model selection and deployment. Demonstrations on the BDD100K driving dataset show that RISE can reveal localized biases even when aggregate metrics look favorable, guiding practitioners toward more balanced, fairer systems. Overall, RISE provides a perception-informed, post-hoc diagnostic interface that complements existing fairness toolkits and supports actionable model analysis across modalities.
Abstract
Evaluating fairness under domain shift is challenging because scalar metrics often obscure exactly where and how disparities arise. We introduce \textit{RISE} (Residual Inspection through Sorted Evaluation), an interactive visualization tool that converts sorted residuals into interpretable patterns. By connecting residual curve structures to formal fairness notions, RISE enables localized disparity diagnosis, subgroup comparison across environments, and the detection of hidden fairness issues. Through post-hoc analysis, RISE exposes accuracy-fairness trade-offs that aggregate statistics miss, supporting more informed model selection.
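The core idea of a sorted residual curve with a knee can be illustrated with a minimal sketch. It assumes absolute residuals and a simple "farthest point from the chord" knee heuristic. The function names `sorted_residual_curve` and `knee_index` are hypothetical, and the knee rule is an assumption rather than the detection method used by RISE. The three indicators are not reproduced because their definitions are not given in this section.

```python
import numpy as np


def sorted_residual_curve(y_true, y_pred):
    """Return absolute residuals sorted in ascending order (assumed residual type)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sort(np.abs(y_true - y_pred))


def knee_index(curve):
    """Index of the point farthest below the chord joining the curve's endpoints.

    This is an assumed heuristic for illustration, not the paper's knee detector.
    """
    curve = np.asarray(curve, dtype=float)
    n = len(curve)
    span = curve[-1] - curve[0] if n > 0 else 0.0
    if n < 3 or span == 0:
        return 0  # too short or flat: no meaningful knee
    x = np.linspace(0.0, 1.0, n)      # normalized rank
    y = (curve - curve[0]) / span     # normalized residual
    # After normalization the chord is y = x; a late sharp rise sits below it.
    return int(np.argmax(x - y))


# Toy data: most samples have small errors, two have large ones.
y_true = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
y_pred = [0.1, 5.0, 0.2, 0.3, 0.1, 4.0]
curve = sorted_residual_curve(y_true, y_pred)
knee = knee_index(curve)
```

Computing one such curve per subgroup or environment and comparing the knee positions and tail heights is one way to see where large errors concentrate, which a single averaged error would hide.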
