Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs

Steve Azzolin; Antonio Longa; Stefano Teso; Andrea Passerini

TL;DR

This work tackles the problem of faithfulness in GNN explanations, formalizing faithfulness via sufficiency and necessity and introducing a harmonic-mean Faith score $Faith_{d,p_R,p_C}(R_A)$. It demonstrates that faithfulness metrics are not interchangeable: different perturbation schemes and interventional distributions $p_R$ and $p_C$ yield divergent assessments, underscoring the need to disclose metric parameters. The authors show a no-go result for injective regular GNNs where perfectly faithful explanations are uninformative, but reveal that modular GNNs (SE-GNNs and DI-GNNs) can produce non-trivial strictly faithful explanations under specific architectural strategies, albeit with trade-offs in expressivity and learnability. They also prove that faithfulness is tightly linked to domain invariance, providing a bound on the ID–OOD gap that depends on both invariance and sufficiency, and provide empirical evidence across SE-/DI-GNN benchmarks. The study highlights practical guidelines for evaluating faithfulness and its role in robust domain generalization, supported by open-source code.
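To make the harmonic-mean aggregation concrete, here is a minimal sketch. It assumes that sufficiency and necessity have already been normalized to $[0, 1]$ with higher values meaning better; the summary above does not specify that normalization, and the function name `faith_score` is hypothetical rather than taken from the authors' code.

```python
def faith_score(suf_score: float, nec_score: float) -> float:
    """Harmonic mean of two scores assumed to lie in [0, 1] (higher is better).

    Assumption: both inputs are already normalized; the normalization
    itself is not described in this summary and is not implemented here.
    """
    for value in (suf_score, nec_score):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores are assumed to lie in [0, 1]")
    total = suf_score + nec_score
    if total == 0.0:
        # Both scores are zero: define the harmonic mean as zero.
        return 0.0
    return 2.0 * suf_score * nec_score / total


if __name__ == "__main__":
    # A highly sufficient but unnecessary explanation still scores poorly.
    print(faith_score(0.95, 0.05))
    print(faith_score(0.60, 0.60))
```

The harmonic mean is dominated by the smaller of the two inputs, so an explanation cannot reach a high score by excelling at sufficiency alone or at necessity alone.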

Abstract

As Graph Neural Networks (GNNs) become more pervasive, it becomes paramount to build reliable tools for explaining their predictions. A core desideratum is that explanations are faithful, i.e., that they portray an accurate picture of the GNN's reasoning process. However, a number of different faithfulness metrics exist, raising the question of what faithfulness is exactly and how to achieve it. We make three key contributions. We begin by showing that existing metrics are not interchangeable (i.e., explanations attaining high faithfulness according to one metric may be unfaithful according to others) and can systematically ignore important properties of explanations. We proceed to show that, surprisingly, optimizing for faithfulness is not always a sensible design goal. Specifically, we prove that for injective regular GNN architectures, perfectly faithful explanations are completely uninformative. This does not apply to modular GNNs, such as self-explainable and domain-invariant architectures, prompting us to study the relationship between architectural choices and faithfulness. Finally, we show that faithfulness is tightly linked to out-of-distribution generalization, in that simply ensuring that a GNN can correctly recognize the domain-invariant subgraph, as prescribed by the literature, does not guarantee that it is invariant unless this subgraph is also faithful. The code is publicly available on GitHub.

Paper Structure (32 sections, 13 theorems, 18 equations, 9 figures, 6 tables)

Introduction
Graph Neural Networks and Faithfulness
Pitfalls of Faithfulness Estimation
Faithfulness metrics are not interchangeable
Not all faithfulness estimators are equally good
Is Faithfulness Worth Optimizing For?
The Case of Regular GNNs
The Case of Modular GNNs
The Importance of Faithfulness for Domain Invariance
Related Work
Discussion and Broader Impact
Proofs
Proof of \ref{prop:metrics-are-not-equivalent}
Proof of \ref{prop:rfid_expval}
Proof of \ref{prop:nec_expval}
...and 17 more sections

Key Result

Proposition 1

(Informal.) Let $(p_R, p_C)$ and $(p_R', p_C')$ be two pairs of interventional distributions. Then, depending on $p_\theta$ and $G_A$, $|\mathsf{Suf}_{d,p_R}(R_A) - \mathsf{Suf}_{d,p_R'}(R_A)|$ and $|\mathsf{Nec}\ldots$
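The dependence on the interventional distribution can be illustrated with a toy example. The sketch below is not the paper's construction: the edge set, the threshold classifier `model`, the two samplers, and the use of the absolute difference as the divergence $d$ are all hypothetical choices, and sufficiency is read here in a simplified way as the expected change in prediction when only the complement of the explanation is perturbed (lower means more sufficient).

```python
from statistics import mean

GRAPH = frozenset({"a", "b", "c", "d", "e"})  # hypothetical edge set
EXPLANATION = frozenset({"a", "b"})           # hypothetical explanation


def model(edges):
    """Toy classifier: positive-class probability is 0.9 iff at least 4 edges remain."""
    return 0.9 if len(edges) >= 4 else 0.1


def drop_whole_complement(graph, explanation):
    """Interventional distribution 1: one sample that keeps only the explanation."""
    return [explanation]


def drop_one_complement_edge(graph, explanation):
    """Interventional distribution 2: uniform over graphs missing one complement edge."""
    return [graph - {edge} for edge in sorted(graph - explanation)]


def sufficiency_gap(sampler, graph=GRAPH, explanation=EXPLANATION):
    """Expected absolute change in prediction under the sampler (lower is better)."""
    reference = model(graph)
    return mean(abs(reference - model(g)) for g in sampler(graph, explanation))


if __name__ == "__main__":
    print("drop whole complement:", sufficiency_gap(drop_whole_complement))
    print("drop one complement edge:", sufficiency_gap(drop_one_complement_edge))
```

In this toy setting the same explanation looks perfectly sufficient when a single complement edge is removed and clearly insufficient when the whole complement is removed, which is the kind of disagreement the proposition describes.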

Figures (9)

Figure 1: Dependency of the necessity metrics on the size of the explanation. $\mathsf{RFid}+$ and $\mathsf{Nec}$ of explanations output by LECI on Motif2-Basis (averaged over 5 seeds) for different explanation sizes (x-axis) and metric hyper-parameters $\kappa, b \in \{3\%, 5\%\}$. $\mathsf{RFid}+$ assigns similar or even higher scores to larger explanations, while $\mathsf{Nec}$ (with a budget $b$ proportional to the average graph size $\bar{m}$) decreases for increasing explanation size, as expected.
Figure 2: Popular modular architectures fail to fully implement faithfulness-enforcing strategies. SE-GNNs are shown at the top and DI-GNNs at the bottom. ✗/✓ means that both variants exist and the choice is made via cross-validation.
Figure 2: Likelihood, faithfulness and domain-invariance are correlated. The plot shows the difference in likelihood between splits. The red line is the best linear fit. Best viewed in color.
Figure 3: Histograms of explanation relevance scores for LECI (top) and GSAT (bottom) on SST2-Length (seed $1$). Both models failed to identify a sparse input subgraph, assigning constant (or nearly constant) scores to every edge.
Figure 4: The probability of deleting at least one truly relevant edge is independent of the number of irrelevant edges if the deletion budget depends on the explanation size. Given an explanation $R$ with $r$ truly relevant edges ($r=5$) and a budget $b$ proportional to the size of the explanation, the plot shows $P(R' \in \mathcal{A}^b(R'))$, where $R' \sim p_C^b(G)$, for a growing number of irrelevant edges in $R$. The probability is approximately constant, i.e., it does not depend on the number of irrelevant edges. The decreasing segments (especially visible for a $10\%$ budget, the blue curve) correspond to ranges where the budget is in fact constant, and thus not proportional to the explanation size. For instance, between 1 and 14 irrelevant edges, a budget of $10\%$ corresponds to deleting one edge.
...and 4 more figures
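The behaviour described in the Figure 4 caption can be reproduced with a short calculation. This is a minimal sketch under assumptions not stated in the caption: deleted edges are drawn uniformly without replacement from the explanation, and the budget is the floor of the given percentage of the explanation size, with a minimum of one edge (a choice that matches the caption's remark that a 10% budget deletes one edge for 1 to 14 irrelevant edges when $r=5$). The function names are hypothetical.

```python
from math import comb


def deletion_budget(explanation_size: int, percent: int) -> int:
    """Assumed rule: floor(percent% of the explanation size), at least one edge."""
    return max(1, (percent * explanation_size) // 100)


def prob_hit_relevant(num_relevant: int, num_irrelevant: int, percent: int) -> float:
    """Probability that a uniform deletion removes at least one relevant edge."""
    size = num_relevant + num_irrelevant
    budget = deletion_budget(size, percent)
    # P(no relevant edge deleted) = C(num_irrelevant, budget) / C(size, budget)
    return 1.0 - comb(num_irrelevant, budget) / comb(size, budget)


if __name__ == "__main__":
    # r = 5 relevant edges, 10% budget, growing number of irrelevant edges.
    for num_irrelevant in (1, 14, 15, 45, 95, 195):
        print(num_irrelevant, round(prob_hit_relevant(5, num_irrelevant, 10), 3))
```

Within a range where the budget stays at one edge, the probability drops as irrelevant edges are added; each time the budget steps up it jumps back, which is consistent with the sawtooth segments mentioned in the caption.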

Theorems & Definitions (22)

Definition 1
Proposition 1
Proposition 2
Proposition 3
Example 1
Proposition 4
Proposition 5
Proposition 6
Theorem 1
Proposition 6
...and 12 more
