Achieving Fairness in Predictive Process Analytics via Adversarial Learning (Extended Version)

Massimiliano de Leoni; Alessandro Padella

TL;DR

This work tackles fairness in predictive process analytics by integrating an adversarial debiasing phase that minimizes reliance on protected variables. The authors introduce a dual-FCNN framework where a predictor $\Phi$ forecasts process outcomes while an adversary $\Phi_Z$ attempts to recover protected-variable values, with a loss $L_{\overline V}$ that penalizes predictive leakage of protected information. They quantify variable influence via Shapley values and demonstrate substantial reductions in protected-variable impact across four case studies, while preserving competitive accuracy and improving Equalized Odds in classification tasks. Training with FCNNs offers strong performance and faster training times compared to LSTMs, enabling a practical, scalable fairness-enhancing approach for time-series process predictions. The results suggest that the proposed method yields fairer predictions without a prohibitive loss in predictive quality, and point to future work on alternative encodings and extending fairness to prescriptive analytics.
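The interplay between the predictor $\Phi$ and the adversary $\Phi_Z$ can be illustrated with a minimal NumPy sketch. It only computes the two objectives on random data and does not train anything. The layer sizes, the weight `lam`, the binary outcome, the single binary protected variable, and the "predictor loss minus weighted adversary loss" form are all assumptions made for illustration; the exact definition of $L_{\overline V}$ is not given in this summary. Feeding the adversary with the predictor's output is one reading of the framework description, also an assumption.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_fcnn(rng, sizes):
    """Random weights for a small fully connected network (hypothetical sizes)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """ReLU hidden layers followed by a linear output layer."""
    h = x
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W, b = params[-1]
    return h @ W + b

def bce(p, t, eps=1e-12):
    """Mean binary cross-entropy between probabilities p and targets t."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

def debiasing_objectives(pred_params, adv_params, x, y, z, lam=1.0):
    """Return (predictor objective, predictor loss, adversary loss).

    Assumed formulation: the adversary minimizes its own loss, while the
    predictor minimizes loss_pred - lam * loss_adv, so it is rewarded when
    the adversary fails to recover the protected variable z.
    """
    y_hat = sigmoid(forward(pred_params, x))     # predicted outcome
    z_hat = sigmoid(forward(adv_params, y_hat))  # adversary's guess of z
    loss_pred = bce(y_hat, y)
    loss_adv = bce(z_hat, z)
    return loss_pred - lam * loss_adv, loss_pred, loss_adv

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))            # 8 encoded running cases, 5 features
y = rng.integers(0, 2, size=(8, 1))    # binary process outcome
z = rng.integers(0, 2, size=(8, 1))    # binary protected variable
predictor = init_fcnn(rng, [5, 4, 1])
adversary = init_fcnn(rng, [1, 3, 1])
objective, loss_pred, loss_adv = debiasing_objectives(
    predictor, adversary, x, y, z, lam=2.0)
print(objective, loss_pred, loss_adv)
```

In an actual training loop the two networks would be updated in alternation, each on its own objective; setting `lam=0` recovers an ordinary predictor with no debiasing pressure.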

Abstract

Predictive business process analytics has become important for organizations, offering real-time operational support for their processes. However, these algorithms often make unfair predictions because they are based on biased variables (e.g., gender or nationality), namely variables embodying discrimination. This paper addresses the challenge of integrating a debiasing phase into predictive business process analytics to ensure that predictions are not influenced by biased variables. Our framework leverages adversarial debiasing and is evaluated on four case studies, showing a significant reduction in the contribution of biased variables to the predicted value. The proposed technique is also compared with the state of the art in fairness in process mining, illustrating that our framework achieves a higher level of fairness while retaining better prediction quality.

Paper Structure (13 sections, 1 equation, 5 figures, 6 tables)

This paper contains 13 sections, 1 equation, 5 figures, 6 tables.

Introduction
Related Works
Preliminaries
Predictive Process Analytics via Fully Connected Neural Networks
Assessment of the Variable Influence on Predictions
An Adversarial Debiasing Framework for Predictive Process Analytics
Evaluation
Introduction to Use Cases and Event-log Datasets
Selection of Protected Variables
Evaluation Metrics
Evaluation Results
Analysis of LSTM and FCNN accuracy and training times
Conclusion

Figures (5)

Figure 1: Example of Shapley Values in the prediction of the probability of occurrence for the activity "Open Loan" in a loan application process. The y-axis denotes the variable names assuming a certain value, while the x-axis represents the probability values. Shapley Values indicate deviations from the mean prediction value, which is 0.80.
Figure 2: Overview of our debiasing framework for predictive process analytics. Vector $\boldsymbol{x}$ is the encoding of the sequence of events of a running case. The framework uses two FCNN models: $\hat{y}$ is the predicted outcome value, and $\boldsymbol{z}$ is the forecast of the values of the protected variables in $\boldsymbol{x}$, given the output of the last layer of the FCNN implementing $\Phi$. The overall model aims to predict $\hat{y}$ accurately while performing poorly at predicting $\boldsymbol{z}$. The dots indicate the encoding layers that generate $\hat{y}$ and $\boldsymbol{z}$.
Figure 3: Shapley values for all variables for the VINST case study predicting the Total-Time outcome, sorted in descending order based on their absolute magnitudes. Shapley values are measured in hours.
Figure 4: Shapley values for all variables for the VINST case study predicting the eventual occurrence of activity Awaiting Assignment, sorted in descending order based on their absolute magnitudes.
Figure 5: Shapley values for all variables for the hospital case study predicting the eventual occurrence of activity Treatment Unsuccessful, sorted in descending order based on their absolute magnitudes.
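
The captions above read Shapley values as deviations from the mean prediction. A minimal sketch of that reading follows: it computes exact Shapley values by enumerating all feature subsets for a toy model. The linear model `f`, the instance `x`, and the `background` cases are hypothetical and chosen only so the result can be checked by hand; the paper's own models and its Shapley estimation procedure are not reproduced here.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, background):
    """Exact Shapley values of f at x, relative to the mean prediction
    over the background cases (exponential cost, small inputs only)."""
    n = len(x)

    def value(S):
        # Features in S keep the value from x; the others are taken
        # from each background case, and the predictions are averaged.
        total = 0.0
        for b in background:
            mixed = [x[i] if i in S else b[i] for i in range(n)]
            total += f(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                S = set(subset)
                contrib += weight * (value(S | {i}) - value(S))
        phi.append(contrib)
    return phi

# Hypothetical toy model: a probability-like linear score over 3 features.
def f(v):
    return 0.5 + 0.2 * v[0] - 0.1 * v[1] + 0.05 * v[2]

background = [[0, 0, 0], [1, 1, 1]]
x = [1, 0, 1]

phi = shapley_values(f, x, background)
mean_prediction = sum(f(b) for b in background) / len(background)
print(phi, mean_prediction, f(x))
```

The contributions sum to the gap between the prediction for `x` and the mean prediction, which is the property that makes bar charts like those in the figures interpretable as a decomposition of a single prediction.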

Theorems & Definitions (3)

Definition: Events
Definition: Traces & Event Logs
Definition: The Process Prediction Problem
