Predicting Satisfaction of Counterfactual Explanations from Human Ratings of Explanatory Qualities

Marharyta Domnich, Rasmus Moorits Veski, Julius Välja, Kadi Tulver, Raul Vicente

TL;DR

The paper tackles predicting user satisfaction with counterfactual explanations by modeling Overall Satisfaction from seven explanatory qualities. Using CounterEval, a dataset of 30 scenarios evaluated by 206 participants on eight metrics, the authors train regression and classification models under random and scenario-based splits and apply SHAP analyses for feature importance. In regression, Feasibility ($\beta \approx 0.358$) and Trust ($\beta \approx 0.362$) are the strongest predictors, Completeness also contributes, and even when Feasibility and Trust are omitted the $R^2$ remains about $0.58$; SHAP confirms the primacy of Feasibility and Trust. These findings guide the design of adaptive counterfactual explanations tailored to user expertise and domain context, showing that multi-metric evaluation yields richer, more actionable insights than relying on Overall Satisfaction alone.

Abstract

Counterfactual explanations are a widely used approach in Explainable AI, offering actionable insights into decision-making by illustrating how small changes to input data can lead to different outcomes. Despite their importance, evaluating the quality of counterfactual explanations remains an open problem. Traditional quantitative metrics, such as sparsity or proximity, fail to fully account for human preferences in explanations, while user studies are insightful but not scalable. Moreover, relying only on a single overall satisfaction rating does not lead to a nuanced understanding of why certain explanations are effective or not. To address this, we analyze a dataset of counterfactual explanations that were evaluated by 206 human participants, who rated not only overall satisfaction but also seven explanatory criteria: feasibility, coherence, complexity, understandability, completeness, fairness, and trust. Modeling overall satisfaction as a function of these criteria, we find that feasibility (the actionability of suggested changes) and trust (the belief that the changes would lead to the desired outcome) consistently stand out as the strongest predictors of user satisfaction, though completeness also emerges as a meaningful contributor. Crucially, even excluding feasibility and trust, other metrics explain 58% of the variance, highlighting the importance of additional explanatory qualities. Complexity appears independent, suggesting more detailed explanations do not necessarily reduce satisfaction. Strong metric correlations imply a latent structure in how users judge quality, and demographic background significantly shapes ranking patterns. These insights inform the design of counterfactual algorithms that adapt explanatory qualities to user expertise and domain context.
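The modeling setup described above (regressing overall satisfaction on the seven criteria, holding out whole scenarios, and refitting without feasibility and trust) can be sketched roughly as follows. This is an illustrative sketch only: the data are synthetic, the weights in `make_synthetic` are made-up values rather than the paper's coefficients, and the helper names (`fit_ols`, `r2`, `scenario_split`) are hypothetical. Ordinary least squares via NumPy stands in for whichever regression models the authors actually used.

```python
import numpy as np

FEATURES = ["feasibility", "coherence", "complexity", "understandability",
            "completeness", "fairness", "trust"]

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r2(X, y, coef):
    """Coefficient of determination of a fitted model on (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def make_synthetic(n_scenarios=30, n_raters=40, seed=0):
    """Synthetic ratings (NOT the CounterEval data): correlated metrics, independent complexity."""
    rng = np.random.default_rng(seed)
    n = n_scenarios * n_raters
    scenario = np.repeat(np.arange(n_scenarios), n_raters)
    latent = rng.normal(size=n)  # shared quality factor makes the ratings correlate
    X = latent[:, None] + rng.normal(scale=0.8, size=(n, len(FEATURES)))
    X[:, 2] = rng.normal(size=n)  # complexity drawn independently of the rest
    weights = np.array([0.4, 0.05, 0.0, 0.05, 0.2, 0.05, 0.4])  # assumed, illustrative
    y = X @ weights + rng.normal(scale=0.5, size=n)
    return X, y, scenario

def scenario_split(scenario, test_fraction=0.2, seed=0):
    """Hold out entire scenarios so no scenario appears in both train and test."""
    rng = np.random.default_rng(seed)
    ids = rng.permutation(np.unique(scenario))
    n_test = max(1, int(round(test_fraction * len(ids))))
    test_mask = np.isin(scenario, ids[:n_test])
    return ~test_mask, test_mask

X, y, scenario = make_synthetic()
train, test = scenario_split(scenario)

# Full model: all seven criteria.
r2_full = r2(X[test], y[test], fit_ols(X[train], y[train]))

# Ablation: drop feasibility and trust, refit on the remaining five criteria.
keep = [i for i, name in enumerate(FEATURES) if name not in ("feasibility", "trust")]
r2_reduced = r2(X[test][:, keep], y[test], fit_ols(X[train][:, keep], y[train]))

print(f"R^2 full: {r2_full:.3f}  R^2 without feasibility/trust: {r2_reduced:.3f}")
```

Because the synthetic metrics share a latent factor, the reduced model still recovers a sizable share of the variance, which mirrors the kind of effect the abstract reports without reproducing its numbers.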

Paper Structure

This paper contains 18 sections, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Mean ratings of each metric (Overall Satisfaction, Feasibility, Consistency, Completeness, Trust, Understandability, Fairness, and Complexity) across all 30 scenarios. The y-axis shows the average participant rating per scenario, with Complexity on a -2 to +2 scale (0 = ideal complexity) and the other metrics on a 1–6 scale.
  • Figure 2: Per-metric distributions grouped by Satisfaction level (low, medium, high). Each histogram is color-coded by the participant's Satisfaction category, illustrating how Feasibility, Consistency, Fairness, Completeness, Trust, Understandability, and Complexity vary for each class. The final subplot depicts the overall Satisfaction distribution itself.
  • Figure 3: Bi-clustering results for all metrics. Each heatmap shows participants (rows) and scenarios (columns) reordered into $k$ co-clusters, where $k$ is selected by minimizing the reconstruction error.
  • Figure 4: Distribution of evaluators' background (Age, Education, Machine Learning Experience, etc.) within the four discovered clusters. Chi-square tests (with p-values and degrees of freedom) assess whether the distributions differ significantly among clusters. We find that Machine Learning Experience (p=0.0299) and Medical Background (p=0.0162) differ significantly across clusters, whereas Age, Education, Counterfactual Explanations Experience, and Metric Understanding do not show significant differences.
  • Figure 5: SHAP analysis illustrating global and local feature importances for the Random Forest Regression model.
  • ...and 3 more figures
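The chi-square comparison mentioned in the Figure 4 caption can be illustrated with a small contingency-table computation. This is a hedged sketch, not the authors' analysis code: the counts below are invented (four clusters by a two-level background attribute), the function name `chi_square_independence` is hypothetical, and only the test statistic and degrees of freedom are computed. A p-value would additionally require the chi-square distribution with `dof` degrees of freedom.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows (e.g. clusters), each a list of counts per category.
    No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical counts: rows = 4 clusters, columns = (no ML experience, ML experience).
counts = [
    [12, 8],
    [5, 15],
    [9, 11],
    [14, 6],
]
stat, dof = chi_square_independence(counts)
print(f"chi-square = {stat:.2f}, dof = {dof}")
```

A large statistic relative to the chi-square distribution with `dof` degrees of freedom would suggest that the attribute is distributed differently across clusters.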