Auditing a Dutch Public Sector Risk Profiling Algorithm Using an Unsupervised Bias Detection Tool

Floris Holstege, Mackenzie Jorgensen, Kirtan Padh, Jurriaan Parie, Joel Persson, Krsto Prorokovic, Lukas Snoek

TL;DR

This paper tackles indirect discrimination in public-sector decision systems by proposing an unsupervised bias-detection framework that does not require demographic labels. It adapts Hierarchical Bias-Aware Clustering (HBAC) to audit DUO's risk-profiling algorithm, applying it to over 250,000 Dutch students across 2012–2023 and validating findings against aggregated CBS data on non-European migration background. Through a combination of real-world application and simulation studies, the work highlights practical pitfalls (e.g., post-selection bias, multiple testing) and provides methodological recommendations, including sample-splitting and Bonferroni corrections. An open-source Python package and web interface accompany the approach, aiming to support scalable, deliberative expert assessment of potential discrimination in algorithmic decision-making when demographic data are unavailable.

Abstract

Algorithms are increasingly used to automate or aid human decisions, yet recent research shows that these algorithms may exhibit bias across legally protected demographic groups. However, data on these groups may be unavailable to organizations or external auditors due to privacy legislation. This paper studies bias detection using an unsupervised clustering tool when data on demographic groups are unavailable. We collaborate with the Dutch Executive Agency for Education to audit an algorithm that was used to assign risk scores to college students at the national level in the Netherlands between 2012 and 2023. Our audit covers more than 250,000 students from the whole country. The unsupervised clustering tool highlights known disparities between students with a non-European migration background and students of Dutch origin. Our contributions are three-fold: (1) we assess bias in a real-world, large-scale and high-stakes decision-making process by a governmental organization; (2) we use simulation studies to highlight potential pitfalls of using the unsupervised clustering tool to detect true bias when demographic group data are unavailable and provide recommendations for valid inferences; (3) we provide the unsupervised clustering tool in an open-source library. Our work serves as a starting point for a deliberative assessment by human experts to evaluate potential discrimination in algorithm-supported decision-making processes.
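The sample-splitting and Bonferroni recommendations mentioned in the TL;DR can be illustrated with a minimal sketch. It is not the paper's implementation or the API of the accompanying package: the threshold search `fit_toy_split` is a hypothetical stand-in for HBAC, the bias metric is assumed to be the share of students flagged as high risk, the data are synthetic, and the significance test is an ordinary pooled two-proportion z-test chosen for simplicity. The point is only that clusters are found on one half of the data and tested on the other half, with the significance level divided by the number of clusters tested.

```python
import random
from statistics import NormalDist


def two_proportion_p_value(k1, n1, k2, n2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (k1 + k2) / (n1 + n2)
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    if se == 0:
        return 1.0
    z = (k1 / n1 - k2 / n2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def fit_toy_split(train, thresholds):
    """Hypothetical stand-in for HBAC: choose the age threshold whose two
    sides differ most in flag rate on the TRAINING half only."""
    def gap(t):
        lo = [y for x, y in train if x < t]
        hi = [y for x, y in train if x >= t]
        if not lo or not hi:
            return 0.0
        return abs(sum(lo) / len(lo) - sum(hi) / len(hi))

    best = max(thresholds, key=gap)
    return lambda x: int(x >= best)


def evaluate_clusters_on_holdout(holdout, assign_cluster, alpha=0.05):
    """Test each cluster against the rest on held-out data.
    Returns {cluster: (p_value, significant_after_bonferroni)}."""
    counts = {}
    for x, flagged in holdout:
        c = assign_cluster(x)
        k, n = counts.get(c, (0, 0))
        counts[c] = (k + int(flagged), n + 1)
    total_k = sum(k for k, _ in counts.values())
    total_n = sum(n for _, n in counts.values())
    m = len(counts)  # number of tests, used for the Bonferroni correction
    results = {}
    for c, (k, n) in counts.items():
        rest_n = total_n - n
        p = 1.0 if rest_n == 0 else two_proportion_p_value(k, n, total_k - k, rest_n)
        results[c] = (p, p < alpha / m)
    return results


# Synthetic data (an assumption for illustration): younger students are
# flagged as high risk more often than older students.
rng = random.Random(0)
records = []
for _ in range(4000):
    age = rng.randint(17, 30)
    flagged = rng.random() < (0.4 if age < 21 else 0.1)
    records.append((age, flagged))

rng.shuffle(records)
train, holdout = records[:2000], records[2000:]  # sample splitting

assign_cluster = fit_toy_split(train, thresholds=range(18, 31))
results = evaluate_clusters_on_holdout(holdout, assign_cluster)
for cluster, (p, significant) in sorted(results.items()):
    print(cluster, p, significant)
```

Testing on the same half that was used to find the clusters would reuse the data that selected the largest gap, which is the post-selection bias the TL;DR warns about.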

Paper Structure

This paper contains 24 sections, 4 equations, 15 figures, 3 tables, 1 algorithm.

Figures (15)

  • Figure 1: Schematic overview of the steps involved in applying the unsupervised bias detection tool. The required information is a dataset, a classifier, and a bias metric. Part of the dataset is used to train the Hierarchical Bias-Aware Clustering algorithm (Misztal-Radecka and Indurkhya, 2021). Another part of the dataset is used to test whether differences in the bias metric across clusters are statistically significant.
  • Figure 2: The percentage of students (a) deemed "high risk" and (b) estimated to have a non-European migration background for identified clusters based on the student population of 2014, excluding students whose risk profile was unknown ($n=214,599$).
  • Figure 3: The percentage of students in each cluster for the three characteristics used in the risk profiling algorithm: (a) type of education, (b) age, and (c) distance to parents within the student population of 2014, excluding students whose risk profile was unknown ($n=214,599$). For each subgroup, the average is indicated by the dashed line.
  • Figure 4: The percentage of students with a non-European migration background for the three used profiling characteristics: (a) type of education, (b) age and (c) distance to parents for students in the CUB 2014 dataset ($n=248,650$), excluding students where the distance is unknown.
  • Figure 5: The percentage of students with a non-European migration background for three (bivariate) combinations of the used profiling characteristics: (a) type of education and age, (b) type of education and distance to parents, and (c) age and distance to parents for the CUB 2014 dataset ($n=248,650$), excluding students where the distance is unknown.
  • ...and 10 more figures