The Role of Syntactic Span Preferences in Post-Hoc Explanation Disagreement

Jonathan Kamp; Lisa Beinborn; Antske Fokkens

TL;DR

This work examines disagreement between post-hoc explanation methods from a linguistic perspective and systematically explores the settings of the dynamic *k* approach. It finds that combining dynamic *k* with syntactic spans is effective at capturing the important signals in a sentence, and it proposes an improved setting for global token importance.

Abstract

Post-hoc explanation methods are an important tool for increasing model transparency for users. Unfortunately, the currently used methods for attributing token importance often yield diverging patterns. In this work, we study potential sources of disagreement across methods from a linguistic perspective. We find that different methods systematically select different classes of words and that methods that agree most with other methods and with humans display similar linguistic preferences. Token-level differences between methods are smoothed out if we compare them on the syntactic span level. We also find higher agreement across methods by estimating the most important spans dynamically instead of relying on a fixed subset of size $k$. We systematically investigate the interaction between $k$ and spans and propose an improved configuration for selecting important tokens.
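
To make the token-level versus span-level comparison concrete, here is a minimal sketch, not the authors' implementation. It assumes agreement is measured as Jaccard overlap between the sets of selected tokens (or spans); the attribution scores, the span assignment, and all function names are hypothetical. In particular, `dynamic_tokens` uses a simple above-the-mean threshold as a stand-in for a dynamic $k$, which is an assumption and not necessarily the estimator used in the paper.

```python
from typing import Sequence, Set


def top_k_tokens(scores: Sequence[float], k: int) -> Set[int]:
    """Indices of the k highest-scoring tokens (ties broken by position)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return set(order[:k])


def dynamic_tokens(scores: Sequence[float]) -> Set[int]:
    """Assumed stand-in for dynamic k: keep tokens scoring above the mean."""
    mean = sum(scores) / len(scores)
    return {i for i, s in enumerate(scores) if s > mean}


def to_spans(token_ids: Set[int], span_of: Sequence[int]) -> Set[int]:
    """Map selected token indices to the ids of the spans that contain them."""
    return {span_of[i] for i in token_ids}


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Set overlap in [0, 1]; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical six-token sentence split into three two-token spans.
span_of = [0, 0, 1, 1, 2, 2]
# Hypothetical scores: method A prefers one token of each span, method B the other.
method_a = [0.9, 0.1, 0.8, 0.05, 0.1, 0.0]
method_b = [0.1, 0.9, 0.05, 0.8, 0.0, 0.1]

tokens_a, tokens_b = top_k_tokens(method_a, 2), top_k_tokens(method_b, 2)
token_agreement = jaccard(tokens_a, tokens_b)
span_agreement = jaccard(to_spans(tokens_a, span_of), to_spans(tokens_b, span_of))
dynamic_span_agreement = jaccard(
    to_spans(dynamic_tokens(method_a), span_of),
    to_spans(dynamic_tokens(method_b), span_of),
)
```

In this toy case the two methods never pick the same token, so token-level agreement is zero, yet they pick tokens from the same spans, so span-level agreement is perfect. This is the smoothing effect the abstract describes, shown on made-up numbers only.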

Paper Structure (23 sections, 2 equations, 3 figures, 6 tables)

Introduction
Related Work
Model Interpretation
Linguistic Patterns in Attributions
Top-$k$ Estimation
Linguistic Analysis
Setup
Preference for a Word Class
Span Definition
Head vs. Modifier Preference
Agreement at the Span Level
Setup
The Effect of Dynamic $k$ on Spans
Adjusting Dynamic $k$
Discussion
...and 8 more sections

Figures (3)

Figure 1: Top-$k$ highlights (light background) per attribution method and human preference for $k=4$. The syntactic spans are given underneath.
Figure 2: Preference for different word classes per attribution method.
Figure 3: Span agreement for fixed and dynamic $k$.
