Optimising for Interpretability: Convolutional Dynamic Alignment Networks

Moritz Böhle, Mario Fritz, Bernt Schiele

TL;DR

CoDA Nets introduce Dynamic Alignment Units that output input-dependent linear transforms $o = \mathbf{w}(\vec{x})^\top \vec{x}$, producing model-inherent contribution maps that align with discriminative input patterns. The framework combines dynamic linearity with an alignment bias, implemented through bounded weight norms and efficient eDAU variants, enabling both high classification performance and faithful explanations. Empirical results show competitive accuracy on CIFAR-10 and TinyImagenet, superior detail in attribution maps compared to post-hoc methods, and the potential to create hybrid networks that increase interpretable depth while leveraging standard CNNs. Temperature scaling and embedding-depth interpolation offer practical knobs to trade off interpretability and accuracy, supporting scalable, interpretable vision models.

Abstract

We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA Nets), which are performant classifiers with a high degree of inherent interpretability. Their core building blocks are Dynamic Alignment Units (DAUs), which are optimised to transform their inputs with dynamically computed weight vectors that align with task-relevant patterns. As a result, CoDA Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions. Given the alignment of the DAUs, the resulting contribution maps align with discriminative input patterns. These model-inherent decompositions are of high visual quality and outperform existing attribution methods under quantitative metrics. Further, CoDA Nets constitute performant classifiers, achieving results on par with ResNet and VGG models on, e.g., CIFAR-10 and TinyImagenet. Lastly, CoDA Nets can be combined with conventional neural network models to yield powerful classifiers that scale more easily to complex datasets such as Imagenet whilst exhibiting an increased interpretable depth, i.e., the output can be explained well in terms of contributions from intermediate layers within the network.
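
To make the unit described above concrete, here is a minimal sketch of a single DAU computing $o = \mathbf{w}(\vec{x})^\top \vec{x}$ together with its per-dimension contributions. The exact parameterisation is not given in this excerpt, so the sketch assumes $\mathbf{w}(\vec{x}) = g(\mathbf{A}\mathbf{B}\vec{x} + \vec{b})$, with $g$ rescaling to unit L2 norm as one possible way of bounding the weight norm. The function names, dimensions, and random parameters are illustrative, not the authors' implementation.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dau(x, A, B, b):
    """Hypothetical Dynamic Alignment Unit (assumed form, see lead-in).

    Computes w(x) = g(A B x + b), where g rescales to unit L2 norm,
    and returns (o, w, contributions) with o = w(x)^T x.
    """
    v = [p + q for p, q in zip(matvec(A, matvec(B, x)), b)]
    norm = math.sqrt(sum(c * c for c in v)) or 1.0  # guard against a zero vector
    w = [c / norm for c in v]                       # bounded weight norm: ||w|| <= 1
    contrib = [wi * xi for wi, xi in zip(w, x)]     # contribution of each input dim
    return sum(contrib), w, contrib


rng = random.Random(0)
d, r = 6, 3  # input dimension and rank of A B (illustrative values)
A = [[rng.gauss(0, 1) for _ in range(r)] for _ in range(d)]
B = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(r)]
b = [0.0] * d
x = [rng.gauss(0, 1) for _ in range(d)]

o, w, contrib = dau(x, A, B, b)
x_norm = math.sqrt(sum(c * c for c in x))
w_norm = math.sqrt(sum(c * c for c in w))
```

Under this assumption the output is exactly the sum of the contributions `w[i] * x[i]`, and by the Cauchy-Schwarz inequality `abs(o)` cannot exceed `x_norm`. A large output therefore requires `w` to point in the direction of `x`, which is one way to read the alignment property the abstract refers to.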

Paper Structure

This paper contains 19 sections, 12 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Sketch of a 9-layer CoDA-Net, which computes its output $\vec{a}_9$ for an input $\vec{a}_0$ as a linear transform via a matrix $\mathbf{W}_{0\rightarrow 9}(\vec{a}_0)$. As such, the output can be linearly decomposed into input contributions (see right). This 'global' transformation matrix $\mathbf{W}_{0\rightarrow 9}$ is computed successively via multiple layers of Dynamic Alignment Units (DAUs). These layers, in turn, produce intermediate linear transformation matrices $\mathbf{W}_l(\vec{a}_{l-1})$ that align with the inputs of layer $l$. As a result, the combined matrix $\mathbf{W}_{0\rightarrow 9}$ also aligns well with task-relevant patterns. Positive (negative) contributions for the class 'goldfinch' are shown in red (blue).
  • Figure 2: For different inputs $\vec{x}$, we visualise the linear weights and contributions (for the single layer, see eq. \ref{eq:contrib_1}; for the CoDA-Net, eq. \ref{eq:contrib}) for the ground truth label $l$ and the strongest non-label output $z$. As can be seen, the weights align well with the input images. The first three rows are based on a single DAU layer, the last three on a 5-layer CoDA-Net. The first two samples (rows) per model are correctly classified and the last one is misclassified.
  • Figure 3: Eigenvectors (EVs) of $\mathbf{A}\mathbf{B}$ after maximising the output of a rank-3 DAU over a set of noisy samples of 3 MNIST digits. Effectively, the DAUs encode the most frequent components in their EVs, similar to a principal component analysis (PCA).
  • Figure 4: By lowering the upper bound (cf. eq. \ref{eq:bound}), the correlation maximisation in the DAUs can be emphasised. We show contribution maps for a model trained with different temperatures.
  • Figure 5: Model-inherent contribution maps for the most confident predictions for 18 different classes, sorted by confidence (high to low). We show positive (negative) contributions (eq. \ref{eq:contrib}) per spatial location for the ground truth class logit in red (blue).
  • ...and 6 more figures
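
The layer-wise composition described in the Figure 1 caption can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: it uses small fully connected layers instead of convolutions, a single square matrix `M` per unit standing in for the low-rank product $\mathbf{A}\mathbf{B}$, and unit L2 normalisation of each weight row. The names `dynamic_layer` and `forward` and the layer sizes are hypothetical.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(p, q):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*q))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in p]


def dynamic_layer(a, params):
    """Return W(a): one unit-norm, input-dependent weight row per unit."""
    rows = []
    for M, bias in params:
        v = [p + q for p, q in zip(matvec(M, a), bias)]
        norm = math.sqrt(sum(c * c for c in v)) or 1.0  # guard against zero
        rows.append([c / norm for c in v])
    return rows


def forward(a0, layers):
    """Run all layers and accumulate the global matrix W_{0->L}(a0)."""
    a, W_total = a0, None
    for params in layers:
        W = dynamic_layer(a, params)   # W_l(a_{l-1})
        a = matvec(W, a)               # a_l = W_l(a_{l-1}) a_{l-1}
        W_total = W if W_total is None else matmul(W, W_total)
    return a, W_total


rng = random.Random(1)
sizes = [8, 5, 4, 3]  # input dim, two hidden dims, number of classes
layers = []
for d_in, d_out in zip(sizes, sizes[1:]):
    layers.append([
        ([[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_in)],
         [rng.gauss(0, 0.1) for _ in range(d_in)])
        for _ in range(d_out)
    ])

a0 = [rng.gauss(0, 1) for _ in range(sizes[0])]
out, W = forward(a0, layers)

# Contribution of each input dimension to the strongest output j.
j = max(range(len(out)), key=out.__getitem__)
contributions = [W[j][i] * a0[i] for i in range(len(a0))]
```

Because every layer is linear once its weights have been computed for the given input, the product of the per-layer matrices reproduces the network output exactly, and row `j` of `W` multiplied elementwise with `a0` gives contributions that sum to `out[j]`.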