An experimental study on fairness-aware machine learning for credit scoring problem

Huyen Giang Thi Thu, Thang Viet Doan, Tai Le Quy

TL;DR

The paper investigates fairness in credit scoring through a comprehensive empirical study of fairness-aware ML methods (pre-, in-, and post-processing) across six public financial datasets. It shows that bias exists in the datasets, and that fairness-aware techniques can improve equitable outcomes while maintaining competitive accuracy, with AdaFair often performing strongly across multiple datasets. The findings offer practical guidance for lenders and regulators on selecting fairness approaches and metrics, highlighting trade-offs among statistical parity (SP), equal opportunity (EO), equalized odds (EOd), predictive parity (PP), predictive equality (PE), treatment equality (TE), and ABROCA (Absolute Between-ROC Area). The work also points to future directions, including multi-attribute fairness, synthetic data generation for finance, and interpretable fair models to diagnose and mitigate bias sources.
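
To make two of the listed metrics concrete, here is a minimal sketch of statistical parity and equal opportunity for a binary classifier and a single binary protected attribute. It uses the common textbook definitions (a gap in positive-prediction rates, and a gap in true-positive rates); the paper's exact formulas and sign conventions may differ. The function names and the toy data are assumptions made for illustration and are not taken from the paper.

```python
def positive_rate(y_pred, mask):
    """Share of positive predictions among the instances selected by mask."""
    selected = [p for p, m in zip(y_pred, mask) if m]
    return sum(selected) / len(selected)


def statistical_parity(y_pred, protected):
    """P(pred=1 | non-protected) - P(pred=1 | protected); 0 means parity."""
    non_protected = [not s for s in protected]
    return positive_rate(y_pred, non_protected) - positive_rate(y_pred, protected)


def equal_opportunity(y_true, y_pred, protected):
    """Absolute gap in true-positive rate between the two groups."""
    prot_pos = [s and y == 1 for s, y in zip(protected, y_true)]
    non_prot_pos = [(not s) and y == 1 for s, y in zip(protected, y_true)]
    return abs(positive_rate(y_pred, non_prot_pos) - positive_rate(y_pred, prot_pos))


# Hypothetical toy data: 1 = credit approved, True = member of the protected group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
protected = [False, False, False, False, True, True, True, True]

sp = statistical_parity(y_pred, protected)
eo = equal_opportunity(y_true, y_pred, protected)
print(f"SP = {sp:.2f}, EO = {eo:.2f}")
```

In this toy example both gaps are nonzero, which illustrates that the two metrics measure different things: statistical parity ignores the true labels, while equal opportunity looks only at applicants who are actually creditworthy.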

Abstract

Digitalization of credit scoring is an essential requirement for financial organizations and commercial banks, especially in the context of digital transformation. Machine learning techniques are commonly used to evaluate customers' creditworthiness. However, the predicted outcomes of machine learning models can be biased toward protected attributes, such as race or gender. Numerous fairness-aware machine learning models and fairness measures have been proposed. Nevertheless, their performance in the context of credit scoring has not been thoroughly investigated. In this paper, we present a comprehensive experimental study of fairness-aware machine learning in credit scoring. The study explores key aspects of credit scoring, including financial datasets, predictive models, and fairness measures. We also provide a detailed evaluation of fairness-aware predictive models and fairness measures on widely used financial datasets.

Paper Structure (24 sections, 7 equations, 15 figures, 8 tables)

Figures (15)

  • Figure 1: The use of credit scoring datasets
  • Figure 2: Credit approval: Bayesian network (class label: Class, protected attributes: Age, Sex).
  • Figure 3: Credit approval: Relationship between class label and Bank account attribute.
  • Figure 4: Credit scoring: Bayesian network (class label: Label, protected attributes: Age, Marital, Sex).
  • Figure 5: Home Credit: Bayesian network (class label: TARGET, protected attributes: CODE_GENDER, NAME_FAMILY_STATUS).
  • ...and 10 more figures