Improving Fairness in Credit Lending Models using Subgroup Threshold Optimization

Cecilia Ying, Stephen Thomas

TL;DR

This work tackles fairness in credit lending predictions by addressing how excluding sensitive attributes can paradoxically increase unfairness due to historical bias. It introduces Subgroup Threshold Optimizer (STO), a post-processing method that assigns a distinct decision threshold $\tau_i$ to each subgroup, aiming to minimize the discrimination score $D_{\tau_i,\tau_j}$ while preserving or improving the overall PPV, expressed as $PPV_{\tau,i}$. Through two experiments on the Home Credit dataset, STO achieved large reductions in discrimination (over $90\%$ in gender-based subgroup comparisons and up to $97.9\%$ in cluster-based subgroups) and improved or maintained aggregate predictive utility. The approach is model- and data-agnostic, easily deployed as a post-processing step with a human-in-the-loop control, and demonstrates practical, interpretable improvements in subgroup fairness without requiring data rebalancing or algorithm changes.

Abstract

In an effort to improve the accuracy of credit lending decisions, many financial institutions are now using predictions from machine learning models. While such predictions enjoy many advantages, recent research has shown that the predictions have the potential to be biased and unfair towards certain subgroups of the population. To combat this, several techniques have been introduced to help remove the bias and improve the overall fairness of the predictions. We introduce a new fairness technique, called Subgroup Threshold Optimizer (STO), that requires neither alterations to the input training data nor changes to the underlying machine learning algorithm, and thus can be used with any existing machine learning pipeline. STO works by optimizing the classification thresholds for individual subgroups in order to minimize the overall discrimination score between them. Our experiments on a real-world credit lending dataset show that STO can reduce gender discrimination by over 90%.
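
The threshold optimization described in the abstract can be sketched as a small grid search. This is an illustrative sketch, not the authors' implementation. It assumes that the discrimination score is the absolute difference in PPV between two subgroups (the exact definition is not given on this page), that an instance is predicted positive when its probability is at or above the threshold, and that candidate thresholds come from a fixed grid. The names `ppv`, `optimize_thresholds`, and the toy data are hypothetical.

```python
from itertools import product


def ppv(probs, labels, tau):
    """Positive predictive value, TP / (TP + FP), at threshold tau."""
    predicted_pos = [y for p, y in zip(probs, labels) if p >= tau]
    if not predicted_pos:
        return None  # undefined when nothing is predicted positive
    return sum(predicted_pos) / len(predicted_pos)


def optimize_thresholds(data, grid):
    """Grid-search one threshold per subgroup to minimize the PPV gap.

    data: {subgroup: (probs, labels)} for exactly two subgroups (assumed).
    Returns (smallest_gap, {subgroup: tau}); the first minimum found wins ties.
    """
    (g1, (p1, y1)), (g2, (p2, y2)) = data.items()
    best = None
    for t1, t2 in product(grid, grid):
        v1, v2 = ppv(p1, y1, t1), ppv(p2, y2, t2)
        if v1 is None or v2 is None:
            continue  # skip threshold pairs where PPV is undefined
        gap = abs(v1 - v2)  # assumed discrimination score
        if best is None or gap < best[0]:
            best = (gap, {g1: t1, g2: t2})
    return best


# Toy data (made up): model scores and true labels for two subgroups.
data = {
    "A": ([0.9, 0.8, 0.6, 0.4], [1, 1, 0, 0]),
    "B": ([0.9, 0.7, 0.5, 0.3], [1, 0, 0, 0]),
}
grid = [0.3, 0.5, 0.7]

# Baseline: the best gap achievable when both subgroups share one threshold.
shared_gap = min(abs(ppv(*data["A"], t) - ppv(*data["B"], t)) for t in grid)

best_gap, best_taus = optimize_thresholds(data, grid)
print(shared_gap)           # 0.25
print(best_gap, best_taus)  # 0.0 {'A': 0.3, 'B': 0.7}
```

On this toy data no shared threshold brings the PPV gap below 0.25, while separate thresholds close it entirely. The sketch optimizes the gap alone; it does not enforce the goal of preserving or improving overall PPV, which a fuller version would add as a constraint or tie-breaker.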

Paper Structure (15 sections, 7 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Biases in the different stages of an ML pipeline. The typical ML pipeline consists of three main stages (i.e., pre-processing, in-processing, and post-processing). Different types of biases can be introduced at each stage. Researchers have proposed fairness solutions that work at any of the three stages.
  • Figure 2: An ML model predicts probability estimates for all data instances, and a single threshold $\tau$ is used to classify each instance into outcomes.
  • Figure 3: An ML model predicts probability estimates for all data instances. The estimates are split into subgroups and a separate threshold is used to classify each subgroup's instances into outcomes.
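
The contrast between Figures 2 and 3 can be illustrated with a short sketch. The function names, the toy probabilities, the subgroup labels, and the threshold values below are hypothetical, and the convention that a probability at or above the threshold maps to the positive outcome is an assumption.

```python
def classify_single(probs, tau):
    """Figure 2 style: one global threshold tau for every instance."""
    return [1 if p >= tau else 0 for p in probs]


def classify_by_subgroup(probs, groups, taus):
    """Figure 3 style: each instance uses its own subgroup's threshold."""
    return [1 if p >= taus[g] else 0 for p, g in zip(probs, groups)]


# Toy probability estimates (made up) and the subgroup of each instance.
probs = [0.30, 0.55, 0.70, 0.45, 0.62, 0.80]
groups = ["A", "A", "A", "B", "B", "B"]

single = classify_single(probs, tau=0.5)
per_group = classify_by_subgroup(probs, groups, taus={"A": 0.6, "B": 0.4})

print(single)     # [0, 1, 1, 0, 1, 1]
print(per_group)  # [0, 0, 1, 1, 1, 1]
```

The model's probability estimates are identical in both cases; only the final thresholding step differs, which is why this kind of adjustment can sit after any existing model as a post-processing step.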