From Robustness to Explainability and Back Again

Xuanxiang Huang; Joao Marques-Silva

TL;DR

This paper tackles the scalability barrier of formal explainability by linking it to robustness analysis. It generalizes abductive and contrastive explanations to distance-restricted settings under a norm $l_p$ with distance threshold $\epsilon$, enabling the use of robustness tools as oracles to compute explanations. A duality between AXp and CXp, via minimal hitting sets, is extended to the distance-restricted case, and two practical algorithms (linear-search and QuickXplain-based) are proposed to compute distance-restricted AXp/CXp using robustness oracles. Experiments on ACAS Xu DNNs with hundreds of ReLU units demonstrate substantial scalability gains, showing that formal explanations can be obtained for larger models by leveraging robustness tooling, with implications for a broad range of ML classifiers.
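One common way to write the distance-restricted notions mentioned above is sketched below. The symbols are assumptions not fixed by this summary: $\mathcal{F}$ is the set of features, $\mathbb{F}$ the feature space, $\kappa$ the classifier, and $(\mathbf{v}, c)$ the instance being explained together with its prediction. A set of features $\mathcal{X}$ is a weak $\epsilon$-AXp when fixing it guarantees the prediction inside the $l_p$ ball of radius $\epsilon$; a set $\mathcal{Y}$ is a weak $\epsilon$-CXp when freeing it allows the prediction to change inside that ball:

$$
\forall(\mathbf{x}\in\mathbb{F}).\ \Big[\bigwedge_{i\in\mathcal{X}}(x_i=v_i)\ \land\ \lVert\mathbf{x}-\mathbf{v}\rVert_p\le\epsilon\Big]\rightarrow(\kappa(\mathbf{x})=c)
$$

$$
\exists(\mathbf{x}\in\mathbb{F}).\ \Big[\bigwedge_{i\in\mathcal{F}\setminus\mathcal{Y}}(x_i=v_i)\ \land\ \lVert\mathbf{x}-\mathbf{v}\rVert_p\le\epsilon\Big]\land(\kappa(\mathbf{x})\neq c)
$$

Under this reading, an $\epsilon$-AXp (respectively $\epsilon$-CXp) is a subset-minimal set satisfying the first (respectively second) condition.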

Abstract

Formal explainability guarantees the rigor of computed explanations, and so it is paramount in domains where rigor is critical, including those deemed high-risk. Unfortunately, since its inception formal explainability has been hampered by poor scalability. This limitation still holds for some families of classifiers, the most significant being deep neural networks. This paper addresses the poor scalability of formal explainability and proposes novel efficient algorithms for computing formal explanations. The novel algorithm computes explanations by instead answering a number of robustness queries, where the number of such queries is at most linear in the number of features. Consequently, the proposed algorithm establishes a direct relationship between the practical complexity of formal explainability and that of robustness. To achieve these goals, the paper generalizes the definition of formal explanations, which allows the use of robustness tools based on different distance norms and supports reasoning in terms of a target degree of robustness. Preliminary experiments validate the practical efficiency of the proposed approach.
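To make the "at most linear number of robustness queries" idea concrete, here is a minimal sketch of a deletion-based linear search. It is not the authors' implementation: `is_robust` is a hypothetical brute-force oracle over a small discrete domain under the $l_\infty$ norm, standing in for a real robustness tool, and `kappa`, `linear_search_axp`, and the toy instance are illustrative names only.

```python
from itertools import product

def is_robust(classify, v, free, eps, domain):
    """Hypothetical brute-force oracle (assumption, not a real tool).

    Returns True if no point x that agrees with v outside `free` and lies
    within l_inf distance eps of v changes the prediction.
    """
    c = classify(v)
    free = sorted(free)
    # Values each free feature may take while staying within eps of v[i].
    choices = [[d for d in domain if abs(d - v[i]) <= eps] for i in free]
    for combo in product(*choices):
        x = list(v)
        for i, val in zip(free, combo):
            x[i] = val
        if classify(tuple(x)) != c:
            return False  # adversarial example found inside the ball
    return True

def linear_search_axp(classify, v, eps, domain):
    """Deletion-based search: one oracle call per feature."""
    n = len(v)
    fixed = set(range(n))  # start with every feature fixed
    for i in range(n):
        # Tentatively free feature i along with all already-freed features.
        candidate_free = set(range(n)) - (fixed - {i})
        if is_robust(classify, v, candidate_free, eps, domain):
            fixed.remove(i)  # feature i is not needed for the guarantee
    return fixed

def kappa(x):
    """Toy classifier (assumption): predicts 1 iff x[0] >= 2 and x[1] >= 1."""
    return 1 if x[0] >= 2 and x[1] >= 1 else 0

axp_eps2 = linear_search_axp(kappa, (3, 2, 0), 2, range(4))  # {0, 1}
axp_eps1 = linear_search_axp(kappa, (3, 2, 0), 1, range(4))  # set()
```

With $\epsilon = 2$ the prediction can be flipped through features 0 or 1, so both must be fixed; with $\epsilon = 1$ the instance is already robust and the explanation is empty. In this sketch the loop issues exactly one oracle call per feature, which is where the linear bound comes from; the cost of each call depends entirely on the robustness tool used.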

Paper Structure

This paper contains 22 sections, 9 theorems, 12 equations, 5 figures, 5 tables, and 3 algorithms.

Introduction
Preliminaries
Minimal hitting sets.
Norm $l_{p}$.
Classification problems.
Robustness.
Logic-based explanations.
Distance-Restricted Explanations
Definitions.
Properties.
From Robustness to Explainability
Required properties of robustness tool.
Computing distance-restricted AXp's.
Relaxing property \ref{it:robt:03}.
Contrastive explanations & enumeration.
...and 7 more sections

Key Result

Proposition 1

Given an explanation problem ${\mathcal{E}}$, and norm $p$ and a value $\epsilon>0$ then,

Figures (5)

Figure 1: Runtime for ACASXU_1
Figure 2: Runtime for ACASXU_2
Figure 3: Runtime for ACASXU_3
Figure 4: Runtime for ACASXU_4
Figure 5: Runtime for ACASXU_5

Theorems & Definitions (20)

Example 1
Example 2
Example 3
Proposition 1: MHS Duality between AXp's and CXp's
Definition 1: Distance-restricted (W)AXp, $\epsilon$-(W)AXp
Definition 2: Distance-restricted (W)CXp, $\epsilon$-(W)CXp
Example 4
Remark 1
Proposition 2
Proposition 3
...and 10 more
