Comparative Explanations via Counterfactual Reasoning in Recommendations

Yi Yu, Zhenxing Hu

TL;DR

This work addresses the fidelity of explanations in recommender systems by moving from reduction-based counterfactuals to a comparative counterfactual framework. CoCountER applies differentiable swap operations to item aspects to generate counterfactual explanations for arbitrary item pairs, identifying the most influential aspects whose swap flips the ranking under a black-box predictor. Empirical results across three Amazon domains show that CoCountER consistently improves the Probability of Necessity (PN) and Probability of Sufficiency (PS) over baselines such as CountER, indicating more faithful, context-aware explanations. The approach offers a practical path toward more trustworthy explanations and suggests future integration with generative models to produce natural-language counterfactuals.

Abstract

Explainable recommendation through counterfactual reasoning seeks to identify the influential aspects of items in recommendations, which can then be used as explanations. However, state-of-the-art approaches, which aim to minimize changes in product aspects while reversing the recommendation decision according to an aggregated decision boundary score, often lead to factual inaccuracies in explanations. To address this problem, we propose Comparative Counterfactual Explanations for Recommendation (CoCountER). CoCountER creates counterfactual data through soft swap operations, enabling explanations for recommendations over arbitrary pairs of comparative items. Empirical experiments validate the effectiveness of our approach.

Paper Structure

This paper contains 13 sections, 10 equations, 3 figures, 2 tables, and 1 algorithm.

Figures (3)

  • Figure 1: Matching-based versus counterfactual reasoning methods. The numerical values of the aspects reflect the user's focus on each aspect and the item's performance on each aspect, both derived from user reviews.
  • Figure 2: Our proposed comparative counterfactual reasoning method. In this example, to explain Headphone A, we first select a reference item (B or C), then intervene by swapping aspect values between the compared items to alter their relative ranking.
  • Figure 3: Hyper-parameter sensitivity analysis of CoCountER. (a) Effect of reference item position. (b) Effect of the number of reference items. Both PN and PS benefit from lower-ranked reference items and moderately sized reference sets.
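
The swap intervention in the Figure 2 caption can be illustrated with a small sketch. This is not the paper's implementation: the dot-product `score` is a hypothetical stand-in for the black-box predictor, the aspect values are made up, and a greedy search over hard 0/1 swaps is substituted for the gradient-based optimization of a soft mask. The function names `soft_swap`, `score`, and `greedy_flip_mask` are assumptions introduced for illustration only.

```python
import numpy as np


def soft_swap(a, b, mask):
    """Blend two aspect vectors: mask=0 keeps the originals, mask=1 swaps fully."""
    a_cf = (1.0 - mask) * a + mask * b
    b_cf = (1.0 - mask) * b + mask * a
    return a_cf, b_cf


def score(user, item):
    """Toy stand-in predictor (assumption): user focus dotted with item quality."""
    return float(np.dot(user, item))


def greedy_flip_mask(user, a, b):
    """Swap aspects one at a time until item a no longer outranks item b.

    Aspects are tried in order of their contribution to the score gap,
    which is exact only for this linear toy scorer.
    """
    mask = np.zeros_like(a, dtype=float)
    contribution = user * (a - b)  # per-aspect share of score(a) - score(b)
    for k in np.argsort(-contribution):
        a_cf, b_cf = soft_swap(a, b, mask)
        if score(user, a_cf) < score(user, b_cf):
            break  # ranking already flipped, stop swapping
        mask[k] = 1.0
    return mask


# Hypothetical aspect values for three aspects of two headphones.
user = np.array([0.9, 0.1, 0.5])    # user's focus on each aspect
item_a = np.array([4.0, 3.0, 2.0])  # recommended item
item_b = np.array([3.0, 5.0, 3.0])  # reference item

mask = greedy_flip_mask(user, item_a, item_b)
a_cf, b_cf = soft_swap(item_a, item_b, mask)
print("swapped aspects:", np.flatnonzero(mask))
print("scores after swap:", score(user, a_cf), score(user, b_cf))
```

The aspects selected by the mask play the role of the explanation: swapping only those values between the two items is enough to reverse their relative ranking under the toy scorer.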