Convolutional Dynamic Alignment Networks for Interpretable Classifications

Moritz Böhle, Mario Fritz, Bernt Schiele

TL;DR

The paper addresses the interpretability gap in deep networks by introducing Convolutional Dynamic Alignment Networks (CoDA-Nets), built from Dynamic Alignment Units (DAUs) that compute input-dependent linear transforms. Because the network as a whole then acts as a linear transform for any given input, its output decomposes faithfully into per-input contributions, and because the DAU weights align with discriminative patterns, the resulting model-inherent contribution maps highlight those patterns. Empirically, CoDA-Nets reach accuracy competitive with ResNet and VGG models on CIFAR-10 and TinyImagenet, and their contribution maps outperform existing attribution methods under quantitative metrics; temperature scaling is used to modulate how strongly alignment is enforced. The approach thus enables inherently interpretable classifiers with detailed, locality-sensitive explanations without a substantial sacrifice in performance, and it points to shared components and more efficient implementations as paths to scaling in future work.

Abstract

We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA-Nets), which are performant classifiers with a high degree of inherent interpretability. Their core building blocks are Dynamic Alignment Units (DAUs), which linearly transform their input with weight vectors that dynamically align with task-relevant patterns. As a result, CoDA-Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions. Given the alignment of the DAUs, the resulting contribution maps align with discriminative input patterns. These model-inherent decompositions are of high visual quality and outperform existing attribution methods under quantitative metrics. Further, CoDA-Nets constitute performant classifiers, achieving results on par with ResNet and VGG models on, e.g., CIFAR-10 and TinyImagenet.
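
To make the idea of an input-dependent linear transform concrete, here is a minimal NumPy sketch of a single DAU. The abstract does not give the parametrisation, so the form used here is an assumption: a low-rank weight vector $\mathbf{w}(\mathbf{x}) = g(\mathbf{A}\mathbf{B}\mathbf{x} + \mathbf{b})$ with $g$ rescaling to unit L2 norm, and output $\mathbf{w}(\mathbf{x})^\top \mathbf{x}$. The function name `dau` and all shapes are hypothetical.

```python
import numpy as np

def dau(x, A, B, b):
    """Hypothetical single Dynamic Alignment Unit (assumed low-rank form).

    Assumption: w(x) = g(A @ B @ x + b), with g rescaling to unit L2 norm,
    and the unit's output is the dot product w(x) . x.
    """
    v = A @ (B @ x) + b                      # unnormalised, input-dependent weights
    w = v / (np.linalg.norm(v) + 1e-12)      # norm-bounded weight vector, ||w|| <= 1
    contributions = w * x                    # contribution of each input dimension
    return contributions.sum(), w, contributions

rng = np.random.default_rng(0)
d, r = 8, 3                                  # input size and rank (illustrative)
A = rng.normal(size=(d, r))
B = rng.normal(size=(r, d))
b = rng.normal(size=d)
x = rng.normal(size=d)

out, w, contrib = dau(x, A, B, b)
```

Under this assumed form, the output is exactly the sum of the per-dimension contributions, and since the weight norm is at most 1 the output cannot exceed $\lVert\mathbf{x}\rVert$. The unit can therefore only produce a large output by pointing $\mathbf{w}(\mathbf{x})$ in the direction of $\mathbf{x}$, which is one way to read the "alignment" described above.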

Paper Structure

This paper contains 22 sections, 27 equations, 19 figures, 3 tables.

Figures (19)

  • Figure 1: Sketch of a 9-layer CoDA-Net, which computes its output $\mathbf{a_9}$ for an input $\mathbf{a_0}$ as a linear transform via a matrix $\mathbf{W_{0\rightarrow9}(\mathbf{a}_0)}$, such that the output can be linearly decomposed into input contributions (see right). $\mathbf{W_{0\rightarrow9}}$ is computed successively via multiple layers of Dynamic Alignment Units (DAUs), which produce matrices $\mathbf{W}_l$ that align with their respective inputs $\mathbf{a}_{l-1}$. As a result, the combined matrix $\mathbf{W_{0\rightarrow9}}$ aligns well with task-relevant patterns. Positive (negative) contributions for the class 'goldfinch' are shown in red (blue).
  • Figure 2: For different inputs $\mathbf{x}$, we visualise the linear weights and contributions (for the single layer, see eq. \ref{eq:contrib_1}; for the CoDA-Net, eq. \ref{eq:contrib}) for the ground truth label $l$ and the strongest non-label output $z$. As can be seen, the weights align well with the input images. The first three rows are based on a single DAU layer, the last three on a 5-layer CoDA-Net. The first two samples (rows) per model are correctly classified and the last one is misclassified.
  • Figure 3: Eigenvectors (EVs) of $\mathbf{AB}$ after maximising the output of a rank-3 DAU over a set of noisy samples of 3 MNIST digits. Effectively, the DAUs encode the most frequent components in their EVs, similar to a principal component analysis (PCA).
  • Figure 4: By lowering the upper bound (cf. eq. \ref{eq:bound}), the correlation maximisation in the DAUs can be emphasised. We show contribution maps for a model trained with different temperatures.
  • Figure 5: Model-inherent contribution maps for the most confident predictions for 18 different classes, sorted by confidence (high to low). We show positive (negative) contributions (eq. \ref{eq:contrib}) per spatial location for the ground truth class logit in red (blue).
  • ...and 14 more figures
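
The composition described in the Figure 1 caption, where per-layer matrices $\mathbf{W}_l$ are combined into a single input-dependent matrix so that the output decomposes linearly into input contributions, can be sketched as follows. This is a simplified fully connected stand-in rather than the convolutional architecture, and the layer parametrisation (low-rank with L2-normalised rows), the helper names `dau_layer` and `coda_forward`, and all sizes are assumptions made for illustration.

```python
import numpy as np

def dau_layer(a, A, B, b):
    """Input-dependent matrix W(a) of one hypothetical fully connected DAU layer.

    Assumed shapes: A (n_out, d_in, r), B (r, d_in), b (n_out, d_in).
    Each row of the result is one DAU's norm-bounded weight vector.
    """
    V = A @ (B @ a) + b                                   # (n_out, d_in)
    return V / (np.linalg.norm(V, axis=1, keepdims=True) + 1e-12)

def coda_forward(a0, layers):
    """Run the layers and accumulate the combined matrix W_{0->L}(a0)."""
    a = a0
    W_total = np.eye(a0.size)
    for A, B, b in layers:
        W = dau_layer(a, A, B, b)   # depends on the current activation
        a = W @ a                   # linear transform of the layer input
        W_total = W @ W_total       # W_{0->l} = W_l @ W_{0->l-1}
    return a, W_total

rng = np.random.default_rng(0)
dims, r = [12, 10, 10, 4], 3        # illustrative layer widths and rank
layers = [
    (rng.normal(size=(d_out, d_in, r)),
     rng.normal(size=(r, d_in)),
     rng.normal(size=(d_out, d_in)))
    for d_in, d_out in zip(dims[:-1], dims[1:])
]
a0 = rng.normal(size=dims[0])

out, W_total = coda_forward(a0, layers)
c = int(np.argmax(out))             # class whose output is explained
contrib = W_total[c] * a0           # per-input contribution to output c
```

In this sketch the network output equals `W_total @ a0` for the given input, and the entries of `contrib` sum to `out[c]`. That is the sense in which a contribution map is an exact decomposition of the output rather than a post-hoc approximation.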