The Fairness of Credit Scoring Models

Christophe Hurlin; Christophe Pérignon; Sébastien Saurin

TL;DR

This paper addresses algorithmic fairness in credit scoring. It proposes a formal framework to test whether a scoring model is fair, diagnose which variables drive any bias, and mitigate disparities while preserving predictive accuracy. The framework has three parts: a likelihood-ratio test for fairness inference, a novel interpretability method called the Fairness Partial Dependence Plot (FPDP) that identifies candidate variables, and a post-processing mitigation step that neutralizes the selected features and uses Pareto-front optimization to balance fairness against performance. On the German and Taiwan credit datasets, removing or neutralizing a small set of proxy variables restores fairness with only modest losses in accuracy. Hyperparameter choices, by contrast, can strongly affect outcomes, which exposes lenders to operational and regulatory risk. The work gives lenders and regulators practical tools to monitor, diagnose, and improve fair lending practices in high-stakes credit decisions, and it may apply to other automated decision processes.
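The FPDP idea can be illustrated with a minimal sketch. It assumes that the FPDP works like a standard partial dependence plot: one feature is set to the same value for every applicant, the model is re-scored, and a fairness measure is recomputed. The toy model, the feature names `income` and `part_time`, the data, and the use of a raw statistical-parity gap as the fairness measure are all hypothetical choices made for brevity, not the paper's implementation.

```python
def parity_gap(preds, protected):
    """Absolute gap in approval rates between the protected group (1) and the rest (0)."""
    g1 = [p for p, d in zip(preds, protected) if d == 1]
    g0 = [p for p, d in zip(preds, protected) if d == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def fairness_pdp(model, rows, protected, feature, grid):
    """For each grid value, fix `feature` to that value for everyone and recompute the gap."""
    curve = []
    for value in grid:
        # Neutralize the feature: every applicant gets the same value.
        neutralized = [{**row, feature: value} for row in rows]
        preds = [model(row) for row in neutralized]
        curve.append((value, parity_gap(preds, protected)))
    return curve


# Hypothetical scoring rule: approve (1) when income minus a part-time penalty reaches 3.
def toy_model(row):
    return 1 if row["income"] - 2 * row["part_time"] >= 3 else 0


# Hypothetical applicants; `part_time` acts as a proxy for the protected attribute.
rows = [
    {"income": 4, "part_time": 1},
    {"income": 6, "part_time": 1},
    {"income": 4, "part_time": 0},
    {"income": 6, "part_time": 0},
]
protected = [1, 1, 0, 0]

baseline_gap = parity_gap([toy_model(r) for r in rows], protected)
curve = fairness_pdp(toy_model, rows, protected, "part_time", [0, 1])
```

In this toy setup the approval-rate gap disappears once `part_time` is held fixed, which is the kind of signal that would flag a feature as a candidate variable for mitigation.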

Abstract

In credit markets, screening algorithms aim to discriminate between good-type and bad-type borrowers. However, when doing so, they can also discriminate between individuals sharing a protected attribute (e.g., gender, age, racial origin) and the rest of the population. This can be unintentional and originate from the training dataset or from the model itself. We show how to formally test the algorithmic fairness of scoring models and how to identify the variables responsible for any lack of fairness. We then use these variables to optimize the fairness-performance trade-off. Our framework provides guidance on how algorithmic fairness can be monitored by lenders, controlled by their regulators, and improved for the benefit of protected groups, while still maintaining a high level of forecasting accuracy.
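The fairness-performance trade-off can be pictured as a Pareto front over candidate models. The sketch below is a generic non-dominated filter, not the paper's optimization procedure. The candidate names and the accuracy and unfairness figures are made-up illustrative numbers, not results from the paper.

```python
def pareto_front(candidates):
    """Keep candidates that no other candidate beats on both accuracy and unfairness.

    Each candidate is (name, accuracy, unfairness); higher accuracy and
    lower unfairness are better.
    """
    front = []
    for name, acc, gap in candidates:
        dominated = any(
            a >= acc and g <= gap and (a > acc or g < gap)
            for _, a, g in candidates
        )
        if not dominated:
            front.append((name, acc, gap))
    return front


# Hypothetical candidates: a baseline model and versions with features neutralized.
candidates = [
    ("baseline", 0.78, 0.20),
    ("neutralize_A", 0.77, 0.05),
    ("neutralize_B", 0.70, 0.10),
    ("neutralize_A_and_B", 0.74, 0.01),
]

front = pareto_front(candidates)
front_names = [name for name, _, _ in front]
```

Here `neutralize_B` is dropped because `neutralize_A` is both more accurate and less unfair. The remaining candidates each represent a different compromise, and choosing among them is a policy decision rather than a computation.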

Paper Structure (32 sections, 1 theorem, 16 equations, 15 figures, 12 tables)

Introduction
Literature review
Literature in financial economics
Literature in machine learning
Measuring fairness
Framework and notations
Fairness definitions
Fairness diagnosis
Fairness inference
Interpretability
Mitigation
Application
Data
Credit scoring models
Fairness diagnosis
...and 17 more sections

Key Result

Theorem 1

Under the null hypothesis of fairness $H_{0,i}$, the test statistic $F_{H_{0,i}}$ converges in distribution to a chi-squared distribution as the sample size $n$ tends to infinity.
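A chi-squared limit of this kind is what makes a fairness test usable in practice: the statistic is compared with a chi-squared critical value. The sketch below shows one standard test of this type, a likelihood-ratio (G) test of independence between a binary decision and a binary protected attribute. It is an assumption-laden illustration, not the paper's statistic. The 2x2 setting, and therefore the single degree of freedom, are assumptions of the example and are not taken from the theorem. The counts are hypothetical.

```python
import math


def g_test_2x2(table):
    """Likelihood-ratio (G) test of independence for a 2x2 table of counts.

    Rows: protected group vs. rest. Columns: approved vs. rejected.
    Returns (G statistic, p-value under a chi-squared law with 1 degree of freedom).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # empty cells contribute zero
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative values from rounding
    # Survival function of chi-squared(1): P(X > g) = erfc(sqrt(g / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Hypothetical counts: identical approval rates in both groups.
g_fair, p_fair = g_test_2x2([[30, 20], [30, 20]])

# Hypothetical counts: approval rates of 40% vs. 80%.
g_unfair, p_unfair = g_test_2x2([[20, 30], [40, 10]])
```

With identical approval rates the statistic is zero and fairness is not rejected. With the second table the statistic exceeds the usual 5% critical value for one degree of freedom, so independence between decision and group is rejected.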

Figures (15)

Figure 1: Measures of association between features, target variables, and gender
Figure 2: Fairness PDP for the statistical parity in TREE-prime model
Figure 3: Accuracy-fairness trade-off
Figure A1: Feature distributions
Figure A2: Feature distribution by class of risk
...and 10 more figures

Theorems & Definitions (8)

Definition 1
Definition 2
Definition 3
Definition 4
Theorem 1: Fairness test
Definition 5
Definition 6
Proof
