Towards the Characterization of Representations Learned via Capsule-based Network Architectures

Saja Tawalbeh, José Oramas

TL;DR

The paper tackles the interpretability of Capsule Networks by proposing a principled framework to test whether part-whole relationships are encoded in CapsNet representations. It introduces two methods, Perturbation Analysis and Layer-Wise Relevant Unit Selection, to probe internal activations and activation paths from input to prediction, applying them across MNIST, SVHN, CIFAR-10, and CelebA variants with CapsNetSF17 and CapsNetEM backbones. The study finds that CapsNet representations are neither as disentangled nor as strictly aligned with part-whole structures as often claimed: activation-space perturbations reveal entangled feature dimensions, and the overlap between part and whole activations, measured by Relevance Mass Accuracy (RMA), is low. These results highlight both the potential and the limitations of CapsNets for interpretable representation learning, and point to future work on routing efficiency, broader backbone evaluation, and complementary explanation methods to better understand CapsNet behavior in complex settings.
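To make the RMA measurement concrete, the following is a minimal sketch of Relevance Mass Accuracy as it is commonly defined: the share of total relevance that falls inside a ground-truth region. The function name `relevance_mass_accuracy`, the clipping of negative relevance to zero, and the toy arrays are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def relevance_mass_accuracy(relevance, mask):
    """Fraction of the relevance mass that lies inside a ground-truth mask.

    relevance: 2-D array of per-pixel relevance scores.
    mask:      2-D boolean array, True inside the annotated part.
    """
    relevance = np.asarray(relevance, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if relevance.shape != mask.shape:
        raise ValueError("relevance and mask must have the same shape")

    # Assumption: only positive relevance counts as evidence.
    relevance = np.clip(relevance, 0.0, None)

    total = relevance.sum()
    if total == 0.0:
        return 0.0  # no relevance at all, so nothing falls inside the mask
    return float(relevance[mask].sum() / total)


# Toy example: one quarter of the relevance mass lies inside the mask.
toy_relevance = [[1.0, 0.0], [3.0, 0.0]]
toy_mask = [[True, False], [False, False]]
print(relevance_mass_accuracy(toy_relevance, toy_mask))
```

A value near 1 would mean the relevance concentrates on the annotated part, while a value near 0 would mean it falls mostly outside it.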

Abstract

Capsule Networks (CapsNets) have been re-introduced as a more compact and interpretable alternative to standard deep neural networks. While recent efforts have proved their compression capabilities, to date, their interpretability properties have not been fully assessed. Here, we conduct a systematic and principled study towards assessing the interpretability of these types of networks. Moreover, we pay special attention to analyzing the level to which part-whole relationships are indeed encoded within the learned representation. Our analysis on the MNIST, SVHN, PASCAL-part and CelebA datasets suggests that the representations encoded in CapsNets might not be as disentangled nor as strictly related to part-whole relationships as is commonly stated in the literature.

Paper Structure (13 sections, 2 equations, 19 figures, 4 tables, 3 algorithms)

Figures (19)

  • Figure 1: Capsule network architecture from SF17. The network consists of three layers (a Conv layer, a PC layer, and a CC layer) followed by a decoder.
  • Figure 2: An overview of the computation of the interval $\alpha$, applied to each layer. This interval is utilized in our perturbation analysis (Section 4.1).
  • Figure 3: Overview of the proposed path estimation methodology. The input is passed through a trained CapsNet to obtain activations from different layers. In the top left, the original CapsNet is illustrated. The top right depicts the process of selecting relevant units within the Conv layer (blue rectangle). The bottom left outlines the selection of relevant units between the PC and CC layers (red rectangle). Finally, the bottom right shows the reduced CapsNet, through which each input is processed for further analysis (Section 4.2).
  • Figure 4: Class reconstructions with different architectures on the MNIST and SVHN datasets. For comparison, the first row shows the input samples $x_s$ and the remaining rows show the reconstruction results (predictions). Even with different architectures and training settings, all models embed different properties of the input digits while keeping only the important details from both datasets.
  • Figure 5: Qualitative examples of reconstructed inputs as the vector $v_j$ is perturbed based on the overarching ranges (the interval $\alpha$) across classes. In some cases, perturbing a single dimension of the active class capsule's vector modifies multiple visual features at the same time, on both CapsNet architectures.
  • ...and 14 more figures
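
The perturbation described in the Figure 5 caption can be sketched as a sweep over one dimension of a capsule vector. This excerpt does not give the paper's exact definition of the interval $\alpha$, so the sketch below assumes it is the minimum-to-maximum range of activations observed across samples. The helper names `overarching_interval` and `perturb_dimension` are hypothetical, and the decoder that would turn each perturbed vector into a reconstruction is left out.

```python
import numpy as np


def overarching_interval(activations):
    """Assumed stand-in for the interval alpha: (min, max) over all activations."""
    activations = np.asarray(activations, dtype=float)
    return float(activations.min()), float(activations.max())


def perturb_dimension(capsule_vec, dim, interval, steps=5):
    """Return `steps` copies of capsule_vec with entry `dim` swept over interval.

    Every other dimension is left untouched, so any change in a decoded
    reconstruction can be attributed to the single perturbed dimension.
    """
    lo, hi = interval
    values = np.linspace(lo, hi, steps)
    perturbed = np.tile(np.asarray(capsule_vec, dtype=float), (steps, 1))
    perturbed[:, dim] = values
    return perturbed


# Toy example: sweep dimension 2 of a 4-dimensional vector over [-1, 1].
sweep = perturb_dimension(np.zeros(4), dim=2, interval=(-1.0, 1.0), steps=3)
print(sweep)
```

Each row of the result would then be passed through the trained decoder. If more than one visual attribute changes along the sweep, that dimension is entangled in the sense the caption describes.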