Table of Contents
Fetching ...

Generating Feasible and Plausible Counterfactual Explanations for Outcome Prediction of Business Processes

Alexander Stevens, Chun Ouyang, Johannes De Smedt, Catarina Moreira

TL;DR

A data-driven approach to generate more feasible and plausible counterfactual explanations, restricting the counterfactual algorithm to generate counterfactuals that lie within a high-density region of the process data, ensuring that the proposed counterfactuals are realistic and feasible within the observed process data distribution.

Abstract

In recent years, various machine and deep learning architectures have been successfully introduced to the field of predictive process analytics. Nevertheless, the inherent opacity of these algorithms poses a significant challenge for human decision-makers, hindering their ability to understand the reasoning behind the predictions. This growing concern has sparked the introduction of counterfactual explanations, designed as human-understandable what if scenarios, to provide clearer insights into the decision-making process behind undesirable predictions. The generation of counterfactual explanations, however, encounters specific challenges when dealing with the sequential nature of the (business) process cases typically used in predictive process analytics. Our paper tackles this challenge by introducing a data-driven approach, REVISEDplus, to generate more feasible and plausible counterfactual explanations. First, we restrict the counterfactual algorithm to generate counterfactuals that lie within a high-density region of the process data, ensuring that the proposed counterfactuals are realistic and feasible within the observed process data distribution. Additionally, we ensure plausibility by learning sequential patterns between the activities in the process cases, utilising Declare language templates. Finally, we evaluate the properties that define the validity of counterfactuals.

Generating Feasible and Plausible Counterfactual Explanations for Outcome Prediction of Business Processes

TL;DR

A data-driven approach to generate more feasible and plausible counterfactual explanations, restricting the counterfactual algorithm to generate counterfactuals that lie within a high-density region of the process data, ensuring that the proposed counterfactuals are realistic and feasible within the observed process data distribution.

Abstract

In recent years, various machine and deep learning architectures have been successfully introduced to the field of predictive process analytics. Nevertheless, the inherent opacity of these algorithms poses a significant challenge for human decision-makers, hindering their ability to understand the reasoning behind the predictions. This growing concern has sparked the introduction of counterfactual explanations, designed as human-understandable what if scenarios, to provide clearer insights into the decision-making process behind undesirable predictions. The generation of counterfactual explanations, however, encounters specific challenges when dealing with the sequential nature of the (business) process cases typically used in predictive process analytics. Our paper tackles this challenge by introducing a data-driven approach, REVISEDplus, to generate more feasible and plausible counterfactual explanations. First, we restrict the counterfactual algorithm to generate counterfactuals that lie within a high-density region of the process data, ensuring that the proposed counterfactuals are realistic and feasible within the observed process data distribution. Additionally, we ensure plausibility by learning sequential patterns between the activities in the process cases, utilising Declare language templates. Finally, we evaluate the properties that define the validity of counterfactuals.
Paper Structure (20 sections, 7 equations, 2 figures, 5 tables, 1 algorithm)

This paper contains 20 sections, 7 equations, 2 figures, 5 tables, 1 algorithm.

Figures (2)

  • Figure 1: The counterfactual instances , , and . Both and lie in low-density regions and are therefore considered infeasible. The counterfactual is preferred over because there is a feasible path between the initial data point and the counterfactual. This figure is adapted from poyiadzi2020face
  • Figure 2: A simplified example of how the iBCM algorithm extracts the different trace and label-specific constraints. This figure is adapted from deeva2022predicting.