ProtoCaps: A Fast and Non-Iterative Capsule Network Routing Method

Miles Everett, Mingjun Zhong, Georgios Leontidis

TL;DR

Capsule Networks suffer from slow, memory-heavy iterative routing that hinders scalability. ProtoCaps removes iterations by employing a trainable prototype-based routing in a shared subspace, achieving substantial memory and FLOP reductions while maintaining competitive accuracy. Extensive experiments across MNIST, FashionMNIST, CIFAR-10, SmallNORB, and Imagewoof demonstrate favorable efficiency-accuracy trade-offs, with notable gains in viewpoint generalization and scalability on challenging data. The work suggests promising directions for scaling Capsule Networks through refined prototype initialization, deeper architectures, and further efficiency gains.

Abstract

Capsule Networks have emerged as a powerful class of deep learning architectures, known for robust performance with relatively few parameters compared to Convolutional Neural Networks (CNNs). However, their inherent efficiency is often overshadowed by the slow, iterative routing mechanisms that establish connections between Capsule layers; these mechanisms pose computational challenges that prevent Capsule Networks from scaling. In this paper, we introduce a novel, non-iterative routing mechanism inspired by trainable prototype clustering. This approach aims to reduce computational complexity while retaining, if not enhancing, performance. Furthermore, we use a shared Capsule subspace, which removes the need to project each lower-level Capsule to each higher-level Capsule and thereby significantly reduces memory requirements during training. Our approach outperforms the current best non-iterative Capsule Network, and we additionally evaluate it on the Imagewoof dataset, which is too computationally demanding for iterative approaches to handle efficiently. Our findings underscore the potential of the proposed method to improve the efficiency and performance of Capsule Networks, paving the way for their application to increasingly complex computational scenarios. Code is available at https://github.com/mileseverett/ProtoCaps.
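
To make the idea of single-pass prototype routing in a shared subspace more concrete, the following is a minimal NumPy sketch. It is not the authors' implementation: the cosine similarity, the softmax over higher-level Capsules, the weighted-sum aggregation, and all names (`prototype_routing`, `W_shared`, `prototypes`) are assumptions chosen for illustration, and the output MLP and residual connection mentioned in the Figure 2 caption are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prototype_routing(poses, W_shared, prototypes):
    """Hypothetical one-pass routing (illustrative assumption, not the paper's exact method).

    poses:      (n_lower, d_in)   lower-level Capsule poses
    W_shared:   (d_in, d_sub)     one projection shared by all Capsule pairs
    prototypes: (n_higher, d_sub) one trainable prototype per higher-level Capsule
    Returns higher-level poses (n_higher, d_sub) and couplings (n_lower, n_higher).
    """
    # Project every lower-level Capsule once into the shared subspace.
    z = poses @ W_shared
    # Cosine similarity between projected poses and prototypes (assumed measure).
    z_n = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-8)
    p_n = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    sim = z_n @ p_n.T
    # Each lower-level Capsule distributes its vote across higher-level Capsules.
    coupling = softmax(sim, axis=1)
    # Aggregate in a single pass; no iterative refinement of the couplings.
    higher = coupling.T @ z
    return higher, coupling

rng = np.random.default_rng(0)
poses = rng.normal(size=(32, 16))          # 32 lower-level Capsules
W_shared = rng.normal(size=(16, 16)) * 0.1
prototypes = rng.normal(size=(10, 16))     # 10 higher-level Capsules
higher, coupling = prototype_routing(poses, W_shared, prototypes)
print(higher.shape, coupling.shape)        # (10, 16) (32, 10)
```

In this sketch the projection stores a single `d_in × d_sub` matrix, whereas projecting each lower-level Capsule to each higher-level Capsule separately would need one such matrix per pair (`n_lower × n_higher` of them), which is the kind of memory saving the abstract describes.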

Paper Structure

This paper contains 24 sections, 9 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Test accuracy versus floating-point operations (FLOPs, as calculated by the FVCore library fvcore2023) for various methods on two datasets. a) The MNIST dataset lecun2010mnist, included as a sanity check to show that our method is comparable in accuracy to other methods. b) The Imagewoof dataset Howard_Imagewoof_2019, showing that we outperform SRCaps and our ResNet18 baseline on a dataset that is more difficult in terms of image size. The iterative methods require too much GPU memory (>80GB) to process the Imagewoof dataset at comparable sizes. Notably, while the FLOP count of SRCaps grows by around 20 times when processing the larger feature maps of Imagewoof compared to MNIST, that of our model grows by only about 4 times. An efficient method, with low FLOPs and high accuracy, would appear in the top-left corner. Note the logarithmic scale of the x-axis in the Imagewoof plot.
  • Figure 2: Diagram of our proposed routing algorithm, ProtoCaps. Multiple ProtoCaps layers can be stacked to create a ProtoCaps Network. $\oplus$ and $\otimes$ denote elementwise addition and multiplication, respectively. We also include a residual connection from $Pose_{i}$ to the output of $\text{MLP}_{out}$, as discussed in Section 3.3. Figure 4 shows the end-to-end ProtoCaps network.
  • Figure 3: Diagram illustrating the intuition behind prototype routing. We have three prototypes denoted in red. Our pose matrix embeddings are shown in grey. For simplicity, we assume that an embedding represents an entire image and is derived directly from the image rather than the feature maps of the convolutional backbone. The dog images are sourced from the Imagewoof validation set Howard_Imagewoof_2019.
  • Figure 4: Diagram illustrating the end-to-end structure of our ProtoCaps network, including tensor dimensions for clarity. Note that the Class Capsule layer (the Capsule layer which corresponds to the classification prediction) is incorporated within the Non-Iterative Prototype Routing loop (i.e., ProtoCaps, see Figure 2). It functions in a similar manner to every other ProtoCaps layer, except that its number of Capsules equals the number of classes rather than being a hyperparameter of the network design. The dog image is sourced from the Imagewoof validation set Howard_Imagewoof_2019.
  • Figure 5: Average activations of each Capsule unit in the Convolutional Capsules (ConvCaps) layer of a 3-layer Capsule Network (Primary Caps, Conv Caps, Class Caps) trained on the smallNORB dataset lecun2004learning. The fill level of each circle shows how strongly the Capsule is activated on average over the entire test set for each class; e.g., an empty circle represents a Capsule that was never activated for this class, while a fully filled circle represents a Capsule that is always activated. Each row shows a different class for easy comparison.
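
The per-class averages described in the Figure 5 caption could be computed along the following lines. This is a small NumPy sketch under stated assumptions: activations are taken to be already collected as one row per test sample with values in [0, 1], and the function name `mean_activation_per_class` and the toy data are hypothetical.

```python
import numpy as np

def mean_activation_per_class(activations, labels, n_classes):
    """Average each Capsule's activation over the test samples of each class.

    activations: (n_samples, n_capsules) activation values (assumed in [0, 1])
    labels:      (n_samples,) integer class labels
    Returns an array of shape (n_classes, n_capsules); one row per class.
    """
    out = np.zeros((n_classes, activations.shape[1]))
    for c in range(n_classes):
        mask = labels == c
        if mask.any():  # leave the row at zero if the class has no samples
            out[c] = activations[mask].mean(axis=0)
    return out

# Toy data: 3 samples, 2 Capsules, 2 classes (illustrative values only).
acts = np.array([[1.0, 0.0],
                 [1.0, 0.5],
                 [0.0, 1.0]])
labels = np.array([0, 0, 1])
avg = mean_activation_per_class(acts, labels, n_classes=2)
print(avg)  # row 0: [1.0, 0.25], row 1: [0.0, 1.0]
```

Each row of `avg` corresponds to one row of circles in such a plot: a value of 0 would be drawn as an empty circle and a value of 1 as a fully filled one.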