"Let's Agree to Disagree": Investigating the Disagreement Problem in Explainable AI for Text Summarization

Seema Aswani, Sujala D. Shetty

TL;DR

This study investigates the disagreement problem in explainable AI for text summarization, where different XAI methods yield conflicting explanations for the same model outcome. It introduces Regional Explainable AI (RXAI), a segmentation-based approach that partitions articles into semantically coherent regions and generates localized explanations with multiple XAI methods. Empirical evaluation on XSum and CNN/Daily Mail shows that RXAI substantially improves cross-method agreement at the regional level, particularly for feature- and rank-based metrics. The study also provides an open-source JavaScript visualization for exploring sentence-level attributions. Together, these contributions advance trustworthy, interpretable summarization by offering a practical framework and tooling to analyze and visualize explanations at a finer granularity.

Abstract

Explainable Artificial Intelligence (XAI) methods in text summarization are essential for understanding model behavior and fostering trust in model-generated summaries. Despite the effectiveness of XAI methods, recent studies have highlighted a key challenge in this area known as the "disagreement problem". This problem occurs when different XAI methods yield conflicting explanations for the same model outcome. Such discrepancies raise concerns about the consistency of explanations and reduce confidence in model interpretations, and that confidence is crucial for secure and accountable AI applications. This work is among the first to empirically investigate the disagreement problem in text summarization, demonstrating that such discrepancies are widespread in state-of-the-art summarization models. To address this gap, we propose Regional Explainable AI (RXAI), a novel segmentation-based approach in which each article is divided into smaller, coherent segments using sentence transformers and clustering. We apply XAI methods to these text segments to create localized explanations that help reduce disagreement between different XAI methods, thereby enhancing the trustworthiness of AI-generated summaries. Our results show that the localized explanations are more consistent than full-text explanations. The proposed approach is validated on two benchmark summarization datasets, Extreme Summarization (XSum) and CNN/Daily Mail, and yields a substantial decrease in disagreement. Additionally, an interactive JavaScript visualization tool is developed to facilitate easy, color-coded exploration of attribution scores at the sentence level, enhancing user comprehension of model explanations.
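
The segmentation step described in the abstract (sentence embeddings followed by clustering) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the clustering algorithm, so plain k-means with a deterministic initialization is swapped in here, the function names `kmeans` and `segment_into_regions` are hypothetical, and the tiny hand-written 2-D vectors stand in for embeddings that a sentence transformer would produce.

```python
from math import dist


def kmeans(vectors, k, iters=20):
    """Plain k-means; the first k vectors seed the centroids (illustrative choice)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(v, centroids[c])) for v in vectors]
        # Move each centroid to the mean of its assigned vectors.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def segment_into_regions(sentences, embeddings, k):
    """Group sentences into k regions according to their embedding cluster."""
    labels = kmeans(embeddings, k)
    regions = {}
    for sentence, label in zip(sentences, labels):
        regions.setdefault(label, []).append(sentence)
    return list(regions.values())


# Toy stand-ins for sentence-transformer embeddings (assumed, not real model output).
sentences = ["A", "B", "C", "D"]
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.1, 0.0), (5.1, 5.0)]
regions = segment_into_regions(sentences, embeddings, k=2)
print(regions)
```

Under this sketch, each resulting region would then be passed to the XAI methods separately, so that agreement can be compared per region instead of over the full article.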

Paper Structure

This paper contains 35 sections, 2 equations, 10 figures, 6 tables, 2 algorithms.

Figures (10)

  • Figure 1: System architectures for analyzing and addressing the disagreement problem in XAI methods. (a) Phase A performs global agreement analysis. (b) Phase B applies RXAI for regional evaluation. These phases can be executed in parallel, enabling a transition from a global to a more fine-grained analysis for improved explanation consistency.
  • Figure 2: Text plot of the attribution weights assigned to the input sentences using DeepLIFT. Normalized attribution scores are color-coded: darker sentences have higher attribution weights, indicating a greater contribution to the resulting summary.
  • Figure 17: Global Feature Agreement across different top-$k$ values on the XSum and CNN/DM datasets. The trend of global feature agreement scores across method pairs is similar to that of the sample set analyzed in the main results section.
  • Figure 18: Global Semantic Alignment Score (SAS) heatmaps for the XSum and CNN/DM datasets across all $k$ values. The global semantic alignment scores for this batch are consistent with, and show a similar trend to, those of the other batches.
  • Figure 19: Global Spearman Rank Correlation analysis across XAI methods on the XSum and CNN/DM datasets. The heatmap shows trends similar to those observed in the previous sample set of the global rank correlation analysis.
  • ...and 5 more figures
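
Two of the agreement measures named in the figure captions above, top-$k$ feature agreement and Spearman rank correlation, can be sketched for a single pair of XAI methods as follows. This is a hedged illustration, not the paper's implementation: the function names are hypothetical, ranking by absolute attribution value is an assumption, the Spearman formula used is the simple no-ties form, and the two score vectors are made-up sentence-level attributions.

```python
def top_k(scores, k):
    """Indices of the k scores with the largest absolute value (assumed ranking rule)."""
    return sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)[:k]


def feature_agreement(a, b, k):
    """Fraction of the top-k features that two attribution vectors share."""
    return len(set(top_k(a, k)) & set(top_k(b, k))) / k


def ranks(scores):
    """Rank positions (1 = smallest score); assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(a, b):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)), valid without ties."""
    n = len(a)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Made-up sentence-level attributions from two hypothetical XAI methods.
method_a = [0.9, 0.1, 0.5, 0.3]
method_b = [0.8, 0.4, 0.2, 0.6]

print(feature_agreement(method_a, method_b, k=2))
print(spearman(method_a, method_b))
```

Values near 1 for either measure indicate that the two methods largely agree on which sentences matter and in what order; low or negative values correspond to the disagreement the paper investigates.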