Spatial Information Bottleneck for Interpretable Visual Recognition

Kaixiang Shu, Kai Meng, Junqin Luo

TL;DR

This work addresses spatial entanglement in deep visual models, which hinders interpretability and robustness. It introduces Spatial Information Bottleneck (S-IB), an information-theoretic training framework that reshapes the structure of the Vector-Jacobian Product (VJP) to maximize foreground information and suppress background leakage, thereby sharpening gradient-based explanations. The theoretical core shows that VJPs on discriminative regions form a minimal sufficient statistic and that differentiable masking preserves sufficiency, which makes it tractable to decompose the IB objective into foreground and background terms. Empirically, S-IB improves results across six explanation methods and five benchmarks, enhances localization faithfulness (especially for CAM-based methods), and produces sharper, object-centric visualizations while delivering modest but consistent accuracy gains.

Abstract

Deep neural networks typically learn spatially entangled representations that conflate discriminative foreground features with spurious background correlations, thereby undermining model interpretability and robustness. We propose an information-theoretic framework for understanding gradient-based attribution. We prove that, under mild conditions, the Vector-Jacobian Products (VJP) computed during backpropagation form minimal sufficient statistics of input features with respect to class labels. Motivated by this finding, we propose an encoding-decoding perspective: forward propagation encodes inputs into class space, while VJP in backpropagation decodes this encoding back to feature space. Building on this view, we propose Spatial Information Bottleneck (S-IB) to spatially disentangle information flow. By maximizing mutual information between foreground VJP and inputs while minimizing mutual information in background regions, S-IB encourages networks to encode information only in class-relevant spatial regions. Since post-hoc explanation methods fundamentally derive from VJP computations, directly optimizing VJP's spatial structure during training improves visualization quality across diverse explanation paradigms. Experiments on five benchmarks demonstrate universal improvements across six explanation methods, achieving better foreground concentration and background suppression without method-specific tuning, alongside consistent classification accuracy gains.
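To make the encoding-decoding view concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy linear classifier (so the Jacobian of the logits with respect to the input is simply the weight matrix) and a hand-written soft foreground mask. The names `vjp_linear`, `fg_bg_energy`, and `background_leakage` are hypothetical, and the squared-magnitude ratio is only a crude stand-in for the mutual-information terms that S-IB actually optimizes.

```python
# Toy illustration of splitting a VJP into foreground and background parts.
# Assumption: logits = W x, so the Jacobian d(logits)/dx equals W and the
# VJP v^T J can be written analytically without an autodiff library.

def vjp_linear(W, v):
    """Return v^T W, the VJP that maps a class-space vector v back to input space."""
    n_in = len(W[0])
    return [sum(v[c] * W[c][i] for c in range(len(W))) for i in range(n_in)]

def fg_bg_energy(r, mask):
    """Split the squared VJP magnitude using a soft mask with values in [0, 1]."""
    fg = sum(m * g * g for g, m in zip(r, mask))
    bg = sum((1.0 - m) * g * g for g, m in zip(r, mask))
    return fg, bg

def background_leakage(r, mask, eps=1e-12):
    """Hypothetical proxy: fraction of VJP energy that falls outside the mask."""
    fg, bg = fg_bg_energy(r, mask)
    return bg / (fg + bg + eps)

# Two classes, four input "pixels"; the first two pixels are foreground.
W = [[1.0, 2.0, 0.5, 0.0],
     [0.0, -1.0, 0.5, 1.0]]
v = [1.0, 0.0]               # selects the logit of class 0
mask = [1.0, 1.0, 0.0, 0.0]  # assumed foreground mask

r = vjp_linear(W, v)         # decoding of class 0 back to input space
fg, bg = fg_bg_energy(r, mask)
print(r, fg, bg, background_leakage(r, mask))
```

In a real network the VJP would come from backpropagation rather than a closed form, and a training objective in the spirit of S-IB would reward information in the foreground part while penalizing the background part; the leakage ratio here only illustrates what "background suppression" measures.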

Paper Structure

This paper contains 18 sections, 27 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: (a) During standard training, foreground mutual information (MI) exhibits an increasing trend while background MI gradually decreases, despite small magnitudes. (b) Post-hoc explanation comparisons show that models trained without S-IB produce diffuse attention highlighting only coarse object regions, while our method yields sharper, object-centric visualizations with clearer boundaries.
  • Figure 2: The flowchart of S-IB. The backbone model processes input image $X$ to produce VJP decoding $R(X)$.
  • Figure 3: Qualitative analysis of learned attention patterns using Guided Backpropagation on ResNet-50. The fourth row shows the difference between our method and the baseline: red indicates positive values, and blue indicates negative values.
  • Figure 4: Qualitative comparison of post-hoc explanation methods before (top row) and after (bottom row) applying our training framework.
  • Figure 5: Mutual information measurements between features and representations on fine-grained datasets.