EXAGREE: Mitigating Explanation Disagreement with Stakeholder-Aligned Models

Sichao Li; Tommy Liu; Quanling Deng; Amanda S. Barnard

TL;DR

EXAGREE introduces a stakeholder-centered approach to explanation disagreement by operating inside a Rashomon set of near-optimal models and selecting Stakeholder-Aligned Explanation Models (SAEMs) that maximize Stakeholder-Machine Agreement ($\text{SMA}$). The framework combines a differentiable mask-based attribution network (DMAN) with differentiable sorting (DiffSortNet) and a multi-head mask network (MHMN) to explore diverse explanations while respecting a performance constraint. Empirical results on six real-world OpenXAI datasets show gains in faithfulness ($\text{A}_{\text{faith}}$) and plausibility ($\text{A}_{\text{plaus}}$), improved SMA, and reduced subgroup fairness gaps, with an LLM-assisted interface enabling natural-language stakeholder feedback. By turning explanation disagreement into a selection problem, EXAGREE provides a principled, practical path toward stakeholder-centered XAI in safety-critical domains.

Abstract

Conflicting explanations, arising from different attribution methods or model internals, limit the adoption of machine learning models in safety-critical domains. We turn this disagreement into an advantage and introduce EXplanation AGREEment (EXAGREE), a two-stage framework that selects a Stakeholder-Aligned Explanation Model (SAEM) from a set of similar-performing models. The selection maximizes Stakeholder-Machine Agreement (SMA), a single metric that unifies faithfulness and plausibility. EXAGREE couples a differentiable mask-based attribution network (DMAN) with monotone differentiable sorting, enabling gradient-based search inside the constrained model space. Experiments on six real-world datasets demonstrate simultaneous gains in faithfulness, plausibility, and fairness over baselines while preserving task accuracy. Extensive ablation studies, significance tests, and case studies confirm the robustness and practical feasibility of the method.
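The selection idea described above can be illustrated with a toy sketch. This is not the EXAGREE implementation: it replaces the gradient-based search (DMAN with differentiable sorting) with brute-force enumeration over a handful of precomputed candidates, and it uses Spearman rank correlation as a stand-in for SMA, whose exact definition is not given in this summary. The function names, the relative loss tolerance `epsilon`, and the candidate tuples are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical candidate format: (model name, task loss, feature attributions).
Candidate = Tuple[str, float, Sequence[float]]


def ranks(scores: Sequence[float]) -> List[int]:
    """Rank of each feature by descending score (0 = most important)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    out = [0] * len(scores)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out


def spearman(rank_a: Sequence[int], rank_b: Sequence[int]) -> float:
    """Spearman rank correlation for two tie-free rankings of equal length."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def select_aligned_model(
    candidates: Sequence[Candidate],
    stakeholder_rank: Sequence[int],
    epsilon: float,
) -> Tuple[str, float]:
    """Pick the near-optimal candidate whose feature ranking best matches the stakeholder's.

    Assumption: the Rashomon set is every candidate whose loss is within a
    relative tolerance `epsilon` of the best observed loss.
    """
    best_loss = min(loss for _, loss, _ in candidates)
    rashomon = [c for c in candidates if c[1] <= (1.0 + epsilon) * best_loss]
    # Agreement score here is Spearman correlation, a stand-in for SMA.
    scored = [(spearman(ranks(attr), stakeholder_rank), name) for name, _, attr in rashomon]
    agreement, name = max(scored)
    return name, agreement


# Toy data: three candidate models over three features.
candidates = [
    ("m0", 0.100, [0.5, 0.3, 0.2]),  # best loss, but ranks feature 0 first
    ("m1", 0.104, [0.2, 0.5, 0.3]),  # near-optimal, ranks feature 1 first
    ("m2", 0.150, [0.1, 0.6, 0.3]),  # matches the stakeholder but loses too much accuracy
]
stakeholder_rank = [2, 0, 1]  # stakeholder considers feature 1 most important

print(select_aligned_model(candidates, stakeholder_rank, epsilon=0.05))
```

In this toy setting the performance constraint excludes `m2`, and the stakeholder's ranking decides between `m0` and `m1`. The sketch only conveys the "disagreement as a selection problem" framing; the actual framework searches the constrained model space with gradients rather than enumerating candidates.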
