A Distributionally Robust Optimisation Approach to Fair Credit Scoring

Pablo Casas, Christophe Mues, Huan Yu

TL;DR

This paper tackles fairness in credit scoring under distributional shifts by applying Distributionally Robust Optimisation (DRO) with a convex fairness penalty. By leveraging Wasserstein ambiguity sets and a ground metric that incorporates the sensitive attribute, the authors define and compare LR, LRL2, FLR, DRLR, and DRFLR. They show that DRO-based methods, particularly DRFLR and DRLR, improve a threshold-independent fairness measure (LEO) with little to no sacrifice in predictive accuracy, albeit at higher computational cost. The work suggests a meaningful link between robustness and fairness and highlights the need for scalable implementations and unified fairness metrics for practical deployment in credit scoring contexts.

Abstract

Credit scoring has been catalogued by the European Commission and the Executive Office of the US President as a high-risk classification task, a key concern being the potential harm of making loan approval decisions based on models that are biased against certain groups. To address this concern, recent credit scoring research has considered a range of fairness-enhancing techniques put forward by the machine learning community to reduce bias and unfair treatment in classification systems. Although these techniques vary in how they define fairness and in how they impose it, most of them disregard the robustness of the results. This can create situations where unfair treatment is effectively corrected in the training set but reappears when the model produces out-of-sample classifications. In this paper, we instead investigate how to apply Distributionally Robust Optimisation (DRO) methods to credit scoring, empirically evaluating how they perform in terms of fairness, ability to classify correctly, and robustness of the solution against changes in the marginal proportions. We find that DRO methods provide a substantial improvement in fairness, with almost no loss in predictive performance. These results indicate that DRO can improve fairness in credit scoring, provided that further advances are made in implementing these systems efficiently. In addition, our analysis suggests that many of the commonly used fairness metrics are unsuitable for a credit scoring setting, as they depend on the choice of classification threshold.
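
To make the closing point about threshold dependence concrete, the following is a minimal sketch, not the authors' code. It assumes a statistical-parity style metric defined as the absolute difference in acceptance rates between two groups, and it assumes an applicant is accepted when the predicted probability of default falls below the threshold. The function name `statistical_parity_gap` and the toy scores are hypothetical and chosen only for illustration.

```python
import numpy as np

def statistical_parity_gap(scores, sensitive, threshold):
    """Absolute difference in acceptance rates between groups 0 and 1.

    Assumed convention: an applicant is accepted when the predicted
    probability of default is strictly below `threshold`.
    """
    scores = np.asarray(scores, dtype=float)
    sensitive = np.asarray(sensitive)
    accepted = scores < threshold
    rate_group0 = accepted[sensitive == 0].mean()
    rate_group1 = accepted[sensitive == 1].mean()
    return abs(rate_group0 - rate_group1)

# Toy predicted default probabilities (illustrative, not data from the paper).
scores = [0.1, 0.3, 0.5, 0.7, 0.2, 0.6, 0.8, 0.9]
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]

# The same model scores yield different fairness readings at different thresholds.
for threshold in (0.25, 0.40, 0.65, 0.95):
    gap = statistical_parity_gap(scores, sensitive, threshold)
    print(f"threshold={threshold:.2f}  gap={gap:.2f}")
```

With these toy scores the gap is zero at some thresholds and non-zero at others, even though the underlying scores never change. This is the kind of behaviour that motivates a threshold-independent measure.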

Paper Structure

This paper contains 14 sections, 17 equations, 12 figures, and 2 tables.

Figures (12)

  • Figure 1: Evolution of SP with changes in acceptance threshold (GC); $\rho=0.01$, $\eta=0.04$, $\kappa_y=\kappa_s=0.2$
  • Figure 2: Evolution of ROC and LEO with changes in $\eta$ and $\rho$ (GC)
  • Figure 3: Evolution of ROC and LEO with changes in $\kappa_y$ and $\kappa_s$ (GC)
  • Figure 4: Evolution of ROC, LEO and SP with changes in marginal distribution (GC)
  • Figure 5: Evolution of most relevant coefficients and PD with changes in $\eta$
  • ...and 7 more figures