Interpretability Beyond Classification Output: Semantic Bottleneck Networks

Max Losch, Mario Fritz, Bernt Schiele

TL;DR

The paper addresses the interpretability gap in deep networks by introducing Semantic Bottleneck Networks (SBNs), which insert a semantically meaningful bottleneck layer that governs downstream predictions. By mapping intermediate representations to a fixed set of semantic concepts, the SB-Layer enables direct inspection of the evidence used for decisions while maintaining competitive performance on street-scene segmentation. Through a Cityscapes case study using Broden+ concepts, the authors demonstrate near state-of-the-art results with a dramatically reduced feature dimensionality and show how SB activations reveal error modes, support manipulation experiments, and enable confidence estimation. Extensive experiments on concept selection, semantic losslessness, error analysis, and confidence modeling illustrate the diagnostic value and practical potential of inherently interpretable deep architectures.

Abstract

Today's deep learning systems deliver high performance based on end-to-end training, but they are hard to interpret. To address this issue, we propose Semantic Bottleneck Networks (SBN): deep networks with semantically interpretable intermediate layers that all downstream results are based on. As a consequence, the evidence on which the final prediction is based is transparent to the engineer, and failure cases and modes can be analyzed and avoided by high-level reasoning. We present a case study on street scene segmentation to demonstrate the feasibility and power of SBN. In particular, we start from a well-performing classic deep network, which we adapt to house an SB-Layer containing task-related semantic concepts (such as object parts and materials). Importantly, we can recover state-of-the-art performance despite a drastic dimensionality reduction from 1000s (non-semantic feature) to 10s (semantic concept) channels. Additionally, we show how the activations of the SB-Layer can be used both for interpreting failure cases of the network and for predicting the confidence of the resulting output. For example, we show, for the first time, interpretable segmentation results for most predictions at over 99% accuracy.

Paper Structure

This paper contains 18 sections, 12 figures, 6 tables.

Figures (12)

  • Figure 1: Semantic Bottleneck Network (SBN). While the semantics of features in traditional architectures are unknown, our proposed SBN lends itself to interpretability beyond the output.
  • Figure 2: Construction of SBNs. 1. Start off with a well-performing model on the target task. 2. Train a function (SB) that maps intermediate representations to semantic concepts. 3. Insert the SB back into the original model and finetune all downstream layers.
  • Figure 3: Populating the SB space to find modes of errors. The gray boxes enclosed in the SB indicate the receptive field of the classifier.
  • Figure 4: Sample from Broden+ dataset with annotations for parts (2nd row) and materials (3rd).
  • Figure 5: Task relevant concepts outperform irrelevant ones.
  • ...and 7 more figures
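
The three construction steps in the Figure 2 caption can be illustrated with a small sketch. It is not the authors' implementation. It assumes the SB is a per-pixel linear map (equivalent to a 1x1 convolution at one location) followed by a sigmoid, uses random placeholder weights instead of trained ones, and invents the class name `SemanticBottleneckSketch` and the concept names `wheel`, `window`, and `brick` purely for illustration. The point it shows is structural: the downstream classifier receives only the concept activations, so the evidence behind each prediction can be read off by name.

```python
import math
import random


def linear(weights, bias, x):
    """Apply y = W x + b to one per-pixel feature vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_matrix(rows, cols, rng):
    """Placeholder weights; a real SBN would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class SemanticBottleneckSketch:
    """Hypothetical illustration of a semantic bottleneck at a single pixel."""

    def __init__(self, n_features, concepts, n_classes, seed=0):
        rng = random.Random(seed)
        self.concepts = list(concepts)
        # Step 2: a function that maps intermediate features to concepts
        # (assumed here to be linear + sigmoid; untrained in this sketch).
        self.sb_w = random_matrix(len(self.concepts), n_features, rng)
        self.sb_b = [0.0] * len(self.concepts)
        # Step 3: the downstream layer sees only the concept activations.
        self.head_w = random_matrix(n_classes, len(self.concepts), rng)
        self.head_b = [0.0] * n_classes

    def concept_activations(self, features):
        return [sigmoid(z) for z in linear(self.sb_w, self.sb_b, features)]

    def predict(self, features):
        acts = self.concept_activations(features)
        scores = linear(self.head_w, self.head_b, acts)
        label = max(range(len(scores)), key=scores.__getitem__)
        # The named activations are the inspectable evidence for the label.
        return label, dict(zip(self.concepts, acts))


# Step 1 stand-in: a 1024-channel feature vector from a pretrained model.
features = [0.5] * 1024
model = SemanticBottleneckSketch(
    n_features=1024, concepts=["wheel", "window", "brick"], n_classes=4
)
label, evidence = model.predict(features)
```

Here `evidence` maps each concept name to an activation in (0, 1), and `label` depends on those three values alone, which mirrors the reduction from thousands of feature channels to tens of concept channels described in the abstract.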