Feature Map Similarity Reduction in Convolutional Neural Networks

Zakariae Belmekki, Jun Li, Patrick Reuter, David Antonio Gómez Jáuregui, Karl Jenkins

TL;DR

This work targets redundancy in CNN feature maps by showing that kernel orthogonality does not guarantee reduced feature-map similarity. It derives the Convolutional Similarity loss $L_{CS}$, formalizing a kernel-based objective that minimizes cross-kernel similarity to drive feature-map orthogonality, with the key relation $\langle F_1, F_2\rangle = \langle (K_1 \circledast K_2), (X \circledast X)_{[1-N, N-1]}\rangle$. Empirical results on shallow CNNs and a ResNet18 demonstrate that minimizing $L_{CS}$ improves accuracy, accelerates convergence, and enables much smaller models to achieve comparable performance. The approach provides a computationally efficient alternative to explicit feature-map decorrelation and kernel-only regularization, though it presents challenges when coupled with momentum-based optimizers. Future work will explore combining iterative initialization with momentum dynamics and extending CS to generative frameworks.
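The key relation above can be checked numerically in a simplified setting. The sketch below is not the authors' implementation. It assumes a single-channel 1D input $X$ of length $M$, kernels of length $N \le M$, and reads $\circledast$ as full (zero-padded) cross-correlation. The helper names `xcorr_full` and `feature_map_inner_products` are hypothetical. Under those assumptions, the inner product of two feature maps equals the inner product of the kernel cross-correlation with the input autocorrelation restricted to lags $1-N, \dots, N-1$.

```python
import random


def xcorr_full(a, v):
    """Full cross-correlation: c[k] = sum_n a[n + k] * v[n],
    for lags k = -(len(v) - 1), ..., len(a) - 1 (zero padding outside a)."""
    out = []
    for k in range(-(len(v) - 1), len(a)):
        s = 0.0
        for n, vn in enumerate(v):
            if 0 <= n + k < len(a):
                s += a[n + k] * vn
        out.append(s)
    return out


def dot(u, w):
    """Plain inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, w))


def feature_map_inner_products(x, k1, k2):
    """Return <F1, F2> computed two ways: directly from the feature maps,
    and from the kernel cross-correlation and input autocorrelation."""
    m, n = len(x), len(k1)
    assert len(k2) == n and m >= n  # assumption of this sketch

    # Direct route: build both feature maps, then take their inner product.
    f1 = xcorr_full(x, k1)
    f2 = xcorr_full(x, k2)
    direct = dot(f1, f2)

    # Kernel route: cross-correlate the kernels (lags 1-N .. N-1) ...
    kernel_xcorr = xcorr_full(k2, k1)
    # ... and keep the same lags of the input autocorrelation.
    # The autocorrelation has lags 1-M .. M-1, with lag 0 at index m - 1.
    x_autocorr = xcorr_full(x, x)
    x_window = x_autocorr[m - n : m + n - 1]
    via_kernels = dot(kernel_xcorr, x_window)
    return direct, via_kernels


if __name__ == "__main__":
    # Orthogonal kernels (<K1, K2> = 0) whose feature maps are not orthogonal.
    direct, via_kernels = feature_map_inner_products(
        [1.0, 2.0, 3.0], [1.0, 0.0], [0.0, 1.0]
    )
    print(direct, via_kernels)

    # Random check that both routes agree.
    rng = random.Random(0)
    x = [rng.uniform(-1, 1) for _ in range(16)]
    k1 = [rng.uniform(-1, 1) for _ in range(3)]
    k2 = [rng.uniform(-1, 1) for _ in range(3)]
    print(feature_map_inner_products(x, k1, k2))
```

In the first call the two kernels are orthogonal, yet both routes return the same non-zero value, because the feature maps are shifted copies of the input. This illustrates, in this toy setting only, why kernel orthogonality by itself does not force feature-map orthogonality.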

Abstract

It has been observed that Convolutional Neural Networks (CNNs) suffer from redundancy in feature maps, leading to inefficient capacity utilization. Efforts to address this issue have largely focused on kernel orthogonality methods. In this work, we theoretically and empirically demonstrate that kernel orthogonality does not necessarily lead to a reduction in feature map redundancy. Based on this analysis, we propose the Convolutional Similarity method to reduce feature map similarity, independently of the CNN's input. Convolutional Similarity can be minimized either as a regularization term or through an iterative initialization method. Experimental results show that minimizing Convolutional Similarity not only improves classification accuracy but also accelerates convergence. Furthermore, our method enables the use of significantly smaller models to achieve the same level of performance, promoting a more efficient use of model capacity. Future work will focus on coupling the iterative initialization method with the optimization momentum term and examining the method's impact on generative frameworks.

Paper Structure

This paper contains 13 sections, 20 equations, 5 figures, 9 tables, 2 algorithms.

Figures (5)

  • Figure 1: Convolutional Similarity loss curves for (a) the baseline, (b) $I=500$, and (c) $\beta=0.001$.
  • Figure 2: The classification loss evolution for CNN1 and CNN2 on a logarithmic scale.
  • Figure 3: Training accuracies of CNN1 and CNN2.
  • Figure 4: Accuracies of the ResNet18 model with different configurations, trained with and without Convolutional Similarity Minimization.
  • Figure 5: Test accuracy of the ResNet18 model as a function of the number of iterations $I$.