Convolutional Fully-Connected Capsule Network (CFC-CapsNet): A Novel and Fast Capsule Network

Pouya Shiri, Amirali Baniasadi

TL;DR

This paper addresses the inefficiency of Capsule Networks on complex datasets by introducing the Convolutional Fully-Connected Capsule Network (CFC-CapsNet). It replaces the conventional primary capsule construction with a Convolutional Fully-Connected (CFC) layer that yields significantly fewer capsules with more expressive representations, resulting in faster training and inference and fewer parameters. Combined with an enhanced class-independent decoder and two regularization techniques (capsule dropout and hard training), the approach achieves competitive accuracy on CIFAR-10, SVHN, and Fashion-MNIST while reducing the parameter count (roughly 30% fewer on Fashion-MNIST and ~50% fewer on CIFAR-10/SVHN) and speeding up computation (approximately 4x for training and 4.5x for inference). The work also analyzes parameter sensitivity and robustness to affine transformations, and positions CFC-CapsNet as a practical, lighter variant of CapsNet suitable for real-world applications, with potential extensions to other CapsNet architectures.

Abstract

A Capsule Network (CapsNet) is a relatively new classifier and one of the possible successors of Convolutional Neural Networks (CNNs). CapsNet maintains the spatial hierarchies between features and outperforms CNNs at classifying images with overlapping categories. Even though CapsNet works well on small-scale datasets such as MNIST, it fails to achieve a similar level of performance on more complicated datasets and real applications. In addition, CapsNet is slower than CNNs when performing the same task and relies on a higher number of parameters. In this work, we introduce the Convolutional Fully-Connected Capsule Network (CFC-CapsNet) to address the shortcomings of CapsNet by creating capsules using a different method. We introduce a new layer (the CFC layer) as an alternative way of creating capsules. CFC-CapsNet produces fewer, yet more powerful capsules, resulting in higher network accuracy. Our experiments show that, compared to conventional CapsNet, CFC-CapsNet achieves competitive accuracy, trains and infers faster, and uses fewer parameters on the CIFAR-10, SVHN and Fashion-MNIST datasets.

Paper Structure

This paper contains 23 sections, 8 equations, 13 figures, 6 tables, 1 algorithm.

Figures (13)

  • Figure 1: The architecture of CapsNet. After extracting low-level features using two convolutional layers, the output is reshaped to vectors. These vectors are multiplied by a matrix to create the primary capsules which constitute the basic unit of CapsNet. The output capsules are inferred using the dynamic routing algorithm, and finally the decoder reconstructs the input image.
  • Figure 2: The low-level feature extractor in CapsNet. There are two convolutional layers. The output is reshaped to vectors.
  • Figure 3: CapsNet Decoder network. The image is reconstructed using subsequent FC layers. Note that only the correct capsule (the capsule with the correct prediction) is kept and the rest of the capsules are masked with zeros.
  • Figure 4: Class-Independent Decoder for CapsNet.
  • Figure 5: CFC Layer. To provide a summary of the vectors, all vectors corresponding to spatially correlated neurons are grouped together and fed to different FC layers. The kernel size (K) and output dimensionality (D) are shown in the figure.
  • ...and 8 more figures
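
The grouping described in the Figure 5 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cfc_layer` and `make_weights` are hypothetical, and the sketch assumes non-overlapping K x K windows, one bias-free FC layer per window, and no activation or squashing, none of which is stated in the captions above. It only shows how grouping spatially adjacent feature vectors produces few capsules of dimensionality D.

```python
import random


def cfc_layer(feature_map, weights, K):
    """Group K x K neighbouring feature vectors and map each group to one capsule.

    feature_map: nested lists of shape H x W x C (output of the conv layers).
    weights: dict mapping a group index (gy, gx) to its own D x (K*K*C) matrix,
             i.e. a different FC layer per group (assumption: no bias term).
    Returns a list of capsules, each a list of D floats.
    """
    H = len(feature_map)
    W = len(feature_map[0])
    capsules = []
    # Assumption: windows do not overlap (stride equals K).
    for top in range(0, H - K + 1, K):
        for left in range(0, W - K + 1, K):
            # Flatten the K x K neighbourhood into one vector of length K*K*C.
            flat = [value
                    for dy in range(K)
                    for dx in range(K)
                    for value in feature_map[top + dy][left + dx]]
            fc = weights[(top // K, left // K)]
            # Plain matrix-vector product: one D-dimensional capsule per group.
            capsules.append([sum(w * x for w, x in zip(row, flat)) for row in fc])
    return capsules


def make_weights(H, W, C, K, D, seed=0):
    """Hypothetical helper: small random FC weights, one matrix per group."""
    rng = random.Random(seed)
    return {
        (gy, gx): [[rng.uniform(-0.1, 0.1) for _ in range(K * K * C)]
                   for _ in range(D)]
        for gy in range(H // K)
        for gx in range(W // K)
    }


# Toy sizes chosen for illustration only; they are not taken from the paper.
H, W, C, K, D = 6, 6, 4, 2, 8
feature_map = [[[1.0] * C for _ in range(W)] for _ in range(H)]
weights = make_weights(H, W, C, K, D)
capsules = cfc_layer(feature_map, weights, K)
print(len(capsules), len(capsules[0]))  # 9 8
```

With these toy sizes, the 36 spatial positions collapse into (6/2) x (6/2) = 9 capsules of 8 dimensions each, which is the sense in which a larger K yields fewer capsules in this sketch.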