Spatial Information Bottleneck for Interpretable Visual Recognition

Kaixiang Shu; Kai Meng; Junqin Luo

TL;DR

This work addresses spatial entanglement in deep visual models, which hinders interpretability and robustness. It introduces the Spatial Information Bottleneck (S-IB), an information-theoretic training framework that reshapes the spatial structure of Vector-Jacobian Products (VJPs) to maximize foreground information and suppress background leakage, thereby sharpening gradient-based explanations. The theoretical core shows that VJPs on discriminative regions form a minimal sufficient statistic and that differentiable masking preserves sufficiency, which enables a tractable decomposition of the IB objective into foreground and background terms. Empirically, S-IB yields consistent improvements across multiple explanation methods and datasets, enhances localization faithfulness (especially for CAM-based methods), and produces sharper, object-centric visualizations while delivering modest but consistent accuracy gains.
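As a rough illustration of the foreground/background decomposition described above, the objective might be written schematically as follows. This is a sketch under stated assumptions, not the paper's exact formulation: the symbols $M$ (a differentiable foreground mask), $G$ (the VJP map in input space), $\mathcal{L}_{\mathrm{cls}}$ (the usual classification loss), and the weights $\lambda_{\mathrm{fg}}, \lambda_{\mathrm{bg}}$ are all introduced here for illustration only.

```latex
% Schematic only; all symbols are illustrative assumptions.
% G: VJP map in input space, M in [0,1]^{H x W}: differentiable foreground mask.
\begin{aligned}
G_{\mathrm{fg}} &= M \odot G, \qquad G_{\mathrm{bg}} = (1 - M) \odot G, \\
\min_{\theta}\; \mathcal{L}(\theta)
  &= \mathcal{L}_{\mathrm{cls}}(\theta)
   \;-\; \lambda_{\mathrm{fg}}\, I\!\left(X;\, G_{\mathrm{fg}}\right)
   \;+\; \lambda_{\mathrm{bg}}\, I\!\left(X;\, G_{\mathrm{bg}}\right).
\end{aligned}
```

The signs follow the stated goal: the foreground mutual-information term is rewarded and the background term is penalized. How each term is estimated in practice is not specified in this summary.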

Abstract

Deep neural networks typically learn spatially entangled representations that conflate discriminative foreground features with spurious background correlations, undermining both interpretability and robustness. We propose a new framework for understanding gradient-based attribution from an information-theoretic perspective. We prove that, under mild conditions, the Vector-Jacobian Products (VJPs) computed during backpropagation form minimal sufficient statistics of the input features with respect to the class labels. Motivated by this finding, we adopt an encoding-decoding perspective: forward propagation encodes inputs into class space, while the VJP in backpropagation decodes this encoding back to feature space. Building on this view, we propose the Spatial Information Bottleneck (S-IB) to spatially disentangle information flow. By maximizing the mutual information between foreground VJPs and inputs while minimizing it in background regions, S-IB encourages networks to encode information only in class-relevant spatial regions. Since post-hoc explanation methods fundamentally derive from VJP computations, directly optimizing the spatial structure of VJPs during training improves visualization quality across diverse explanation paradigms. Experiments on five benchmarks demonstrate universal improvements across six explanation methods, achieving better foreground concentration and background suppression without method-specific tuning, alongside consistent gains in classification accuracy.
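To make the encoding-decoding view concrete, the following minimal NumPy sketch computes a VJP for a toy two-layer network and splits it into foreground and background parts with a binary mask. Everything here is an illustrative assumption: the network, the 4x4 "image", the central-patch mask, and the squared-magnitude "energy" share, which is only a crude stand-in for the mutual-information terms and not the estimator used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical): a 4x4 single-channel image and 3 classes.
H, W, HIDDEN, CLASSES = 4, 4, 8, 3
W1 = rng.normal(size=(HIDDEN, H * W))
W2 = rng.normal(size=(CLASSES, HIDDEN))


def forward(x):
    """Encode a flattened image x into class logits."""
    return W2 @ np.tanh(W1 @ x)


def vjp(x, v):
    """Return J(x)^T v: the class-space cotangent v decoded to input space."""
    h = np.tanh(W1 @ x)
    # Chain rule for logits = W2 @ tanh(W1 @ x), applied right to left.
    return W1.T @ ((1.0 - h ** 2) * (W2.T @ v))


x = rng.normal(size=H * W)
target = 1
v = np.eye(CLASSES)[target]      # one-hot cotangent selecting the target logit

g = vjp(x, v).reshape(H, W)      # VJP laid out as a spatial map

# Hypothetical foreground mask: the central 2x2 patch.
mask = np.zeros((H, W))
mask[1:3, 1:3] = 1.0

g_fg = mask * g                  # foreground part of the VJP
g_bg = (1.0 - mask) * g          # background part of the VJP

# Crude proxy (an assumption, not mutual information): share of squared
# VJP magnitude that falls inside the foreground mask.
fg_energy = float(np.sum(g_fg ** 2))
bg_energy = float(np.sum(g_bg ** 2))
fg_fraction = fg_energy / (fg_energy + bg_energy)
print(f"foreground share of VJP energy: {fg_fraction:.3f}")
```

With a one-hot cotangent, the VJP is simply the gradient of the target logit with respect to the input, which is the quantity that saliency-style explanation methods start from. A training-time objective in the spirit of S-IB would push the foreground share up and the background share down, though the paper formulates this through mutual information rather than raw energy.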
