FairSense: Long-Term Fairness Analysis of ML-Enabled Systems

Yining She, Sumon Biswas, Christian Kästner, Eunsuk Kang

TL;DR

The paper tackles the challenge of long-term fairness in ML-enabled systems, where feedback between decisions and the environment can cause disparities to emerge over time. It introduces FairSense, a simulation-based, design-time framework that models a system-environment feedback loop, generates Monte-Carlo evolution traces for each configuration, and uses sensitivity analysis to identify high-impact design choices while enabling utility-fairness trade-offs. The authors demonstrate FairSense on three real-world case studies (Loan Lending, Opioid Risk Scoring, Predictive Policing), show that only a subset of parameters significantly drives long-term fairness, and illustrate Pareto-optimal configurations balancing fairness and utility. They also introduce a regression-based sensitivity analysis and a covering-array sampling heuristic to scale the exploration and provide practical guidance for system designers. The work advances practical, environment-aware analysis of fairness, offering actionable insights for mitigating long-term unfairness in deployed ML-enabled systems.
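
To make the notion of a Pareto-optimal configuration concrete, the following is a minimal sketch, not FairSense's implementation. It assumes each configuration has already been scored on two objectives, fairness and utility, both scaled so that higher is better. The function name `pareto_front`, the configuration names, and all scores are hypothetical values chosen for illustration.

```python
def pareto_front(configs):
    """Return the names of configurations that no other configuration dominates.

    Each entry is (name, fairness, utility); higher is better on both axes.
    A configuration is dominated if another one is at least as good on both
    objectives and strictly better on at least one.
    """
    front = []
    for name, fair, util in configs:
        dominated = any(
            f2 >= fair and u2 >= util and (f2 > fair or u2 > util)
            for _, f2, u2 in configs
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical scores, invented for this example only.
candidates = [
    ("c1", 0.90, 0.40),
    ("c2", 0.70, 0.70),
    ("c3", 0.60, 0.65),  # dominated by c2 on both objectives
    ("c4", 0.30, 0.95),
]
print(pareto_front(candidates))  # ['c1', 'c2', 'c4']
```

Every configuration on the front represents a different trade-off: moving from one to another improves one objective only at the cost of the other, which is why a designer still has to choose among them.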

Abstract

Algorithmic fairness of machine learning (ML) models has raised significant concern in recent years. Many testing, verification, and bias mitigation techniques have been proposed to identify and reduce fairness issues in ML models. The existing methods are model-centric and designed to detect fairness issues under static settings. However, many ML-enabled systems operate in a dynamic environment where the predictive decisions made by the system impact the environment, which in turn affects future decision-making. Such a self-reinforcing feedback loop can cause fairness violations in the long term, even if the immediate outcomes are fair. In this paper, we propose a simulation-based framework called FairSense to detect and analyze long-term unfairness in ML-enabled systems. Given a fairness requirement, FairSense performs Monte-Carlo simulation to enumerate evolution traces for each system configuration. Then, FairSense performs sensitivity analysis on the space of possible configurations to understand the impact of design options and environmental factors on the long-term fairness of the system. We demonstrate FairSense's potential utility through three real-world case studies: loan lending, opioid risk scoring, and predictive policing.
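
The feedback loop and Monte-Carlo step described above can be illustrated with a toy simulation. This is a sketch under strong simplifying assumptions and is not the authors' model: each group is reduced to a single mean credit score, the system lends when that score clears a threshold, and the environment responds by raising or lowering the score depending on a random repayment outcome. All names (`simulate_trace`, `long_term_gap`), the starting scores, and the parameter values are hypothetical.

```python
import random
from statistics import mean


def simulate_trace(threshold, repay_gain, default_loss, steps=50, rng=None):
    """Run one evolution trace of a toy loan-lending feedback loop.

    Returns the absolute gap between the two groups' scores after `steps`.
    """
    rng = rng or random.Random()
    # Hypothetical starting mean credit scores for two groups.
    score = {"A": 650.0, "B": 600.0}
    for _ in range(steps):
        for group in score:
            # System decision: lend only if the score clears the threshold.
            if score[group] >= threshold:
                # Environment response: repayment is more likely at higher scores.
                p_repay = min(max((score[group] - 300.0) / 550.0, 0.0), 1.0)
                if rng.random() < p_repay:
                    score[group] += repay_gain
                else:
                    score[group] -= default_loss
                # Keep scores inside an assumed 300-850 range.
                score[group] = min(max(score[group], 300.0), 850.0)
    return abs(score["A"] - score["B"])


def long_term_gap(threshold, repay_gain, default_loss, n_traces=200, seed=0):
    """Monte-Carlo estimate of the long-term score gap for one configuration."""
    rng = random.Random(seed)
    return mean(
        simulate_trace(threshold, repay_gain, default_loss, rng=rng)
        for _ in range(n_traces)
    )


# Sweep one design parameter (the lending threshold) and compare outcomes.
for threshold in (580, 620, 660):
    gap = long_term_gap(threshold, repay_gain=15, default_loss=20)
    print(f"threshold={threshold}: mean long-term gap = {gap:.1f}")
```

Comparing the estimated gap across thresholds is the simplest form of the question that sensitivity analysis asks: how much does the long-term outcome change when one design option is varied while the others stay fixed?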

Paper Structure (30 sections, 1 equation, 5 figures, 5 tables)

Figures (5)

  • Figure 1: An overview of the FairSense approach
  • Figure 2: A feedback loop created by ML-enabled loan lending system
  • Figure 3: Feedback loop model of ML-enabled system
  • Figure 4: An evolution trace of loan lending system showing long-term unfairness
  • Figure 5: The radar plots visualizing trade-offs in three Pareto-optimal configurations for each case study. All values were scaled to [0,1]. A higher value implies better performance.