Predicting Satisfaction of Counterfactual Explanations from Human Ratings of Explanatory Qualities

Marharyta Domnich; Rasmus Moorits Veski; Julius Välja; Kadi Tulver; Raul Vicente

TL;DR

The paper tackles predicting user satisfaction with counterfactual explanations by modeling Overall Satisfaction from seven explanatory qualities. Using CounterEval, a dataset of 30 scenarios evaluated by 206 participants on eight metrics, the authors train regression and classification models under random and scenario-based splits and apply SHAP analyses for feature importance. In regression, Feasibility ($\beta \approx 0.358$) and Trust ($\beta \approx 0.362$) are the strongest predictors, Completeness also contributes, and even when Feasibility and Trust are omitted the $R^2$ remains about $0.58$; SHAP confirms the primacy of Feasibility and Trust. These findings guide the design of adaptive counterfactual explanations tailored to user expertise and domain context, showing that multi-metric evaluation yields richer, more actionable insights than relying on Overall Satisfaction alone.

Abstract

Counterfactual explanations are a widely used approach in Explainable AI, offering actionable insights into decision-making by illustrating how small changes to input data can lead to different outcomes. Despite their importance, evaluating the quality of counterfactual explanations remains an open problem. Traditional quantitative metrics, such as sparsity or proximity, fail to fully account for human preferences in explanations, while user studies are insightful but not scalable. Moreover, relying only on a single overall satisfaction rating does not lead to a nuanced understanding of why certain explanations are effective or not. To address this, we analyze a dataset of counterfactual explanations that were evaluated by 206 human participants, who rated not only overall satisfaction but also seven explanatory criteria: feasibility, coherence, complexity, understandability, completeness, fairness, and trust. Modeling overall satisfaction as a function of these criteria, we find that feasibility (the actionability of suggested changes) and trust (the belief that the changes would lead to the desired outcome) consistently stand out as the strongest predictors of user satisfaction, though completeness also emerges as a meaningful contributor. Crucially, even excluding feasibility and trust, other metrics explain 58% of the variance, highlighting the importance of additional explanatory qualities. Complexity appears independent, suggesting more detailed explanations do not necessarily reduce satisfaction. Strong metric correlations imply a latent structure in how users judge quality, and demographic background significantly shapes ranking patterns. These insights inform the design of counterfactual algorithms that adapt explanatory qualities to user expertise and domain context.
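The modeling setup described above (regressing overall satisfaction on the seven criteria, holding out whole scenarios, and refitting without feasibility and trust) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the ratings are synthetic rather than taken from CounterEval, the generating weights are invented and only loosely echo the reported coefficients, and plain least squares stands in for whatever regression models the authors actually used. The helper names (`make_synthetic_ratings`, `scenario_split`, `evaluate`) are hypothetical.

```python
import numpy as np

# The seven explanatory criteria named in the abstract.
METRICS = ["feasibility", "coherence", "complexity", "understandability",
           "completeness", "fairness", "trust"]


def make_synthetic_ratings(n_scenarios=30, n_per_scenario=40, seed=0):
    """Generate synthetic ratings (NOT the CounterEval data; illustration only)."""
    rng = np.random.default_rng(seed)
    n = n_scenarios * n_per_scenario
    scenario = np.repeat(np.arange(n_scenarios), n_per_scenario)
    # A shared latent factor makes most criteria correlated with each other.
    latent = rng.normal(size=n)
    X = 0.7 * latent[:, None] + 0.7 * rng.normal(size=(n, len(METRICS)))
    # Assumption: complexity is drawn independently of the other criteria.
    X[:, METRICS.index("complexity")] = rng.normal(size=n)
    # Invented generating weights, ordered as in METRICS.
    weights = np.array([0.36, 0.05, 0.0, 0.05, 0.15, 0.05, 0.36])
    y = X @ weights + 0.4 * rng.normal(size=n)
    return X, y, scenario


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefs...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def scenario_split(scenario, test_fraction=0.2, seed=0):
    """Hold out whole scenarios so no scenario appears in both train and test."""
    rng = np.random.default_rng(seed)
    ids = rng.permutation(np.unique(scenario))
    n_test = max(1, int(round(test_fraction * len(ids))))
    test_mask = np.isin(scenario, ids[:n_test])
    return ~test_mask, test_mask


def evaluate(X, y, train, test, columns):
    """Fit on the training rows using the given columns; return held-out R^2."""
    coef = fit_ols(X[train][:, columns], y[train])
    return r2_score(y[test], predict(coef, X[test][:, columns]))


X, y, scenario = make_synthetic_ratings()
train, test = scenario_split(scenario)

all_cols = list(range(len(METRICS)))
reduced_cols = [i for i, m in enumerate(METRICS)
                if m not in ("feasibility", "trust")]

r2_full = evaluate(X, y, train, test, all_cols)
r2_reduced = evaluate(X, y, train, test, reduced_cols)

print(f"held-out R^2, all seven criteria:        {r2_full:.3f}")
print(f"held-out R^2, without feasibility/trust: {r2_reduced:.3f}")
```

Because the synthetic criteria share a latent factor, the reduced model still recovers a sizeable share of the variance, which mirrors the qualitative pattern the abstract reports; the specific numbers printed here carry no evidential weight about the real dataset.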
