Evaluating AI fairness in credit scoring with the BRIO tool

Greta Coraglia; Francesco A. Genco; Pellegrino Piantadosi; Enrico Bagli; Pietro Giuffrida; Davide Posillipo; Giuseppe Primiero


TL;DR

This work addresses fairness in AI-driven credit scoring by applying the model-agnostic BRIO tool to quantify bias and risk across sensitive attributes in the German Credit Dataset. The authors build a transparent credit scorecard with OptBinning, achieving $\text{AUC}=0.8$ and $\text{Gini}=0.6$, and then quantify fairness risk via BRIO’s aggregated tests, including divergence-based measures. A new integrated risk metric combines multiple fairness checks, and a revenue analysis links fairness risk to provisions and profit, pointing to a candidate score threshold (around $620$) that balances fairness and profitability. Overall, the study shows how BRIO can guide fair lending practices while accounting for financial implications, and it outlines ways to extend the approach to other datasets and contexts.
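The two reported performance figures are consistent with the standard identity $\text{Gini} = 2\,\text{AUC} - 1$. The sketch below illustrates that identity on made-up data; it is not the authors' code, the function names are hypothetical, and it assumes that a higher score marks a better (lower-risk) applicant.

```python
def auc_from_scores(scores, labels):
    """Pairwise AUC: probability that a random good applicant (label 1)
    receives a higher score than a random bad one (label 0); ties count half."""
    goods = [s for s, y in zip(scores, labels) if y == 1]
    bads = [s for s, y in zip(scores, labels) if y == 0]
    if not goods or not bads:
        raise ValueError("need at least one example of each class")
    wins = sum((g > b) + 0.5 * (g == b) for g in goods for b in bads)
    return wins / (len(goods) * len(bads))


def gini_from_auc(auc):
    """Standard relation between the Gini coefficient and the AUC."""
    return 2.0 * auc - 1.0


# Toy scores and outcomes, invented for illustration only.
toy_scores = [700, 650, 640, 600, 580, 550]
toy_labels = [1, 1, 0, 1, 0, 0]

toy_auc = auc_from_scores(toy_scores, toy_labels)
toy_gini = gini_from_auc(toy_auc)
```

The pairwise loop is quadratic in the number of applicants, which is fine for a dataset of this size; a rank-based formula would be the usual choice for larger samples.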

Abstract

We present a method for quantitative, in-depth analysis of fairness issues in AI systems, with an application to credit scoring. To this end we use BRIO, a tool for evaluating AI systems with respect to social unfairness and, more generally, ethically undesirable behaviours. It features a model-agnostic bias detection module, presented in \cite{DBLP:conf/beware/CoragliaDGGPPQ23}, to which a full-fledged unfairness risk evaluation module is added. As a case study, we focus on credit scoring, analysing the UCI German Credit Dataset \cite{misc_statlog_(german_credit_data)_144}. We apply the BRIO fairness metrics to several socially sensitive attributes in the German Credit Dataset, quantifying fairness across demographic segments in order to identify potential sources of bias and discrimination in a credit scoring model. We conclude by combining our results with a revenue analysis.
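To give a rough idea of what a divergence-based comparison between demographic segments can look like, the sketch below computes the Kullback-Leibler and Jensen-Shannon divergences between the distributions of model outcomes for two groups. It does not use BRIO's API and may differ from the paper's exact definitions and thresholds; all function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def distribution(outcomes, support):
    """Empirical distribution of `outcomes` over a fixed, ordered `support`."""
    counts = Counter(outcomes)
    n = len(outcomes)
    return [counts[v] / n for v in support]


def kl_divergence(p, q):
    """D_KL(p || q) in bits; infinite when q is zero where p is positive."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log2(pi / qi)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded by 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy predicted outcomes (1 = approved, 0 = rejected) for two hypothetical
# groups of a sensitive attribute; the values are invented.
group_a = [1, 1, 1, 0]
group_b = [1, 0, 0, 0]
support = [0, 1]

p = distribution(group_a, support)
q = distribution(group_b, support)

kl_pq = kl_divergence(p, q)
js_pq = js_divergence(p, q)
```

A value of `js_pq` near zero would indicate that the two groups receive similar outcome distributions, while larger values flag a disparity worth inspecting. The Jensen-Shannon form is convenient here because it stays finite even when one group never receives a given outcome.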


Paper Structure (10 sections, 10 equations, 4 figures, 2 tables)


Introduction
Preliminary Analysis
ML model construction
Fairness violation analysis in BRIO
Kullback-Leibler divergence
Jensen-Shannon divergence
Risk assessment in BRIO
Risk analysis via BRIO for the German Credit Dataset
Revenue analysis
Conclusions

Figures (4)

Figure 1: Default probability (red line, left vertical axis) and distributions (blue-orange bars, right vertical axis) for the sensitive variables.
Figure 2: ROC curve of the model (left) and Good-Bad performance distributions relative to the predicted score (right).
Figure 3: Comparison between the model's default risk (green line) and the data's default risk (red line) for the sensitive variables.
Figure 4: Trends of profit (green bars, right vertical axis) and the model-data risk difference (pink line, left vertical axis) for multiple score thresholds.
