Self-Explaining Neural Networks for Business Process Monitoring

Shahaf Bassan, Shlomit Gur, Sergey Zeltyn, Konstantinos Mavrogiorgos, Ron Eliav, Dimosthenis Kyriazis

TL;DR

This work targets predictive business process monitoring (PBPM) and introduces what the authors describe as the first self-explaining neural network for PBPM, addressing the Next Activity Prediction task with built-in explanations. The approach extends an LSTM-based predictor with an explanation head and trains it with dual propagation plus faithfulness and cardinality losses, so that each explanation is concise and sufficient while predictive accuracy is preserved or improved. Across four real-world datasets, the method shows substantial gains in explanation faithfulness and computational efficiency over post-hoc baselines such as Anchors. Overall, the results indicate that integrated self-explanations can enhance trust and practicality in PBPM without compromising performance.

Abstract

Tasks in Predictive Business Process Monitoring (PBPM), such as Next Activity Prediction, focus on generating useful business predictions from historical case logs. Recently, Deep Learning methods, particularly sequence-to-sequence models like Long Short-Term Memory (LSTM), have become a dominant approach for tackling these tasks. However, to enhance model transparency, build trust in the predictions, and gain a deeper understanding of business processes, it is crucial to explain the decisions made by these models. Existing explainability methods for PBPM decisions are typically *post-hoc*, meaning they provide explanations only after the model has been trained. Unfortunately, these post-hoc approaches have been shown to face various challenges, including a lack of faithfulness, high computational costs, and significant sensitivity to out-of-distribution samples. In this work, we introduce, to the best of our knowledge, the first *self-explaining neural network* architecture for predictive process monitoring. Our framework trains an LSTM model that not only provides predictions but also outputs a concise explanation for each prediction, while adapting the optimization objective to improve the reliability of the explanation. We first demonstrate that incorporating explainability into the training process does not hurt model performance and, in some cases, actually improves it. Additionally, we show that our method outperforms post-hoc approaches in both the faithfulness of the generated explanations and the efficiency with which they are produced.

Paper Structure

This paper contains 19 sections, 5 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: An illustration of the "traditional" LSTM-based RNN architecture (Tax et al., 2017).
  • Figure 2: An illustration of the dual propagation procedure used in our self-explaining framework, along with the new loss terms: the faithfulness loss $\mathcal{L}_{Faith}$, which ensures the generated explanation is sufficient, and the cardinality loss $\mathcal{L}_{Card}$, which ensures the explanation remains concise. Note that the hidden layers of the model are shared across both propagations.
  • Figure 3: An example of an explanation generated either by Anchors on standard trained models or inherently by our self-explaining approach, when applied to the prediction of the third activity in a BPI12wc process. The text in black represents the sufficient explanation, while the features not included in the explanation appear in gray.
  • Figure 4: An example of an explanation generated either by Anchors on standard trained models or inherently by our self-explaining approach, when applied to the prediction of the fourth activity in a BPI12wc process. The text in black represents the sufficient explanation, while the features not included in the explanation appear in gray.
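
The dual propagation procedure described in the Figure 2 caption can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: a toy linear scorer (`toy_model`) stands in for the shared LSTM layers, and the soft sigmoid mask, the zero `baseline` used for masked-out features, the cross-entropy form of the faithfulness term, and the mean-mask form of the cardinality term are all assumptions chosen for illustration. Only the overall idea comes from the source: one pass produces a prediction and an explanation, a second pass through the same weights checks that the explanation alone is sufficient, and a cardinality penalty keeps the explanation small.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def toy_model(features, class_weights, mask_weights):
    """Hypothetical stand-in for the shared network.

    Returns (class probabilities, per-feature soft mask in (0, 1)).
    In the paper the shared layers are an LSTM; a linear scorer is used
    here only to keep the sketch self-contained.
    """
    logits = [sum(w * x for w, x in zip(row, features)) for row in class_weights]
    mask = [sigmoid(w * x) for w, x in zip(mask_weights, features)]
    return softmax(logits), mask


def self_explaining_losses(features, label, class_weights, mask_weights, baseline=0.0):
    """Compute (prediction, faithfulness, cardinality) losses for one sample."""
    # First propagation: full input -> prediction and explanation mask.
    probs, mask = toy_model(features, class_weights, mask_weights)
    predicted = max(range(len(probs)), key=probs.__getitem__)

    # Second propagation: only the explained features pass through the
    # SAME weights; masked-out features are replaced by the baseline.
    masked = [m * x + (1.0 - m) * baseline for m, x in zip(mask, features)]
    masked_probs, _ = toy_model(masked, class_weights, mask_weights)

    pred_loss = -math.log(probs[label])               # standard cross-entropy
    faith_loss = -math.log(masked_probs[predicted])   # explanation must preserve the prediction
    card_loss = sum(mask) / len(mask)                 # smaller masks are cheaper
    return pred_loss, faith_loss, card_loss


# Illustrative values only; the weights below are arbitrary.
features = [1.0, 2.0, 0.5]
class_weights = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2]]
mask_weights = [0.0, 0.0, 0.0]
pred, faith, card = self_explaining_losses(features, 1, class_weights, mask_weights)
total = pred + 1.0 * faith + 0.1 * card  # hypothetical loss weights
print(pred, faith, card, total)
```

In this sketch the two extra terms pull in opposite directions: shrinking the mask lowers the cardinality term but tends to raise the faithfulness term once features that the prediction depends on are removed, so the relative weights (hypothetical here) control how concise the resulting explanations are.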