Kronecker Product Feature Fusion for Convolutional Neural Network in Remote Sensing Scene Classification

Yinzhu Cheng

TL;DR

This paper tackles remote sensing scene classification by introducing KPFF, a feature fusion method that unifies Add and Concat via the Kronecker product with learnable weights. The fusion is defined as $\mathbf{y} = \sum_{i=1}^{n} \mathbf{w}_i \otimes \mathbf{x}_i$, which degenerates to Concat when $\mathbf{w}_i = \mathbf{e}_i$ and to Add when $\mathbf{w}_i = \mathbf{e}_1$, enabling flexible, data-adaptive fusion. The authors provide backpropagation formulas for $\mathbf{w}_i$ and $\mathbf{x}_i$ to support end-to-end training, and analyze the time complexity. Experiments on the UC-Merced dataset with backbones AlexNet, VGGNet, and Inception show that KPFF yields higher accuracy than both the original networks and the Add/Concat baselines, demonstrating its effectiveness for remote sensing scene classification. The work highlights the practical impact of learnable, Kronecker-based fusion in improving CNN performance while maintaining competitive computational costs, and suggests future avenues including complexity reduction and integration with attention or coding-theoretic approaches.
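The fusion rule and its two special cases can be checked numerically. The snippet below is a minimal sketch, not the authors' implementation: it assumes each feature $\mathbf{x}_i$ is a flat vector, that all $\mathbf{w}_i$ share one length, and that $\mathbf{e}_i$ is the $i$-th standard basis vector of $\mathbb{R}^n$. The function name `kpff` and the toy inputs are hypothetical.

```python
import numpy as np

def kpff(weights, features):
    """Fuse features as y = sum_i kron(w_i, x_i) (illustrative helper, not from the paper)."""
    return sum(np.kron(w, x) for w, x in zip(weights, features))

# Two toy feature vectors (assumed flat; real CNN features are multi-dimensional tensors).
x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, 4.0])
features = [x1, x2]
n = len(features)
basis = np.eye(n)  # rows are e_1, ..., e_n

# Concat case: w_i = e_i places each x_i in its own block of the output.
concat_out = kpff([basis[0], basis[1]], features)

# Add case: w_i = e_1 for every i stacks all features into the first block.
# With e_1 in R^n the remaining blocks are zero; with e_1 in R^1 they vanish entirely.
add_out = kpff([basis[0], basis[0]], features)

print(concat_out)  # matches np.concatenate(features)
print(add_out)     # first block equals x1 + x2, the rest is zero
```

In a trainable setting the $\mathbf{w}_i$ would be learnable parameters rather than fixed basis vectors, so the learned fusion can sit anywhere between these two extremes.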

Abstract

Remote sensing scene classification is a challenging and valuable research topic in which Convolutional Neural Networks (CNNs) play a crucial role. A CNN can extract hierarchical convolutional features from remote sensing imagery, and fusing features from different layers can enhance its performance. Two successful feature fusion methods, Add and Concat, are employed in certain state-of-the-art CNN algorithms. In this paper, we propose a novel feature fusion algorithm, Kronecker Product Feature Fusion (KPFF), which unifies these two methods using the Kronecker product, and we discuss the associated backpropagation procedure. To validate the proposed method, a series of experiments is designed and conducted. The results demonstrate its effectiveness in enhancing CNN accuracy in remote sensing scene classification.
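The backpropagation step can be illustrated for the vector form $\mathbf{y} = \sum_i \mathbf{w}_i \otimes \mathbf{x}_i$. The sketch below is derived directly from that definition and is not taken from the paper's own formulas: it assumes flat vectors $\mathbf{w}_i \in \mathbb{R}^m$ and $\mathbf{x}_i \in \mathbb{R}^d$, and uses the fact that $\mathbf{w} \otimes \mathbf{x}$ reshaped to $m \times d$ equals the outer product $\mathbf{w}\mathbf{x}^\top$. The names `kpff`, `kpff_backward`, and `G` are hypothetical.

```python
import numpy as np

def kpff(weights, features):
    """Forward pass: y = sum_i kron(w_i, x_i)."""
    return sum(np.kron(w, x) for w, x in zip(weights, features))

def kpff_backward(grad_y, weights, features):
    """Gradients of a scalar loss w.r.t. each w_i and x_i, given dL/dy."""
    m, d = weights[0].shape[0], features[0].shape[0]
    # kron(w, x) reshaped row-major to (m, d) is outer(w, x), so y reshapes to sum_i w_i x_i^T.
    G = grad_y.reshape(m, d)
    grad_w = [G @ x for x in features]   # dL/dw_i = G x_i
    grad_x = [G.T @ w for w in weights]  # dL/dx_i = G^T w_i
    return grad_w, grad_x

# Finite-difference check on random inputs, using the linear loss L = grad_y . y.
rng = np.random.default_rng(0)
m, d, n = 3, 4, 2
weights = [rng.normal(size=m) for _ in range(n)]
features = [rng.normal(size=d) for _ in range(n)]
grad_y = rng.normal(size=m * d)

def loss(ws, xs):
    return float(grad_y @ kpff(ws, xs))

grad_w, grad_x = kpff_backward(grad_y, weights, features)

eps = 1e-6
num_grad_w0 = np.zeros(m)
for k in range(m):
    bumped = [w.copy() for w in weights]
    bumped[0][k] += eps
    num_grad_w0[k] = (loss(bumped, features) - loss(weights, features)) / eps

print(np.allclose(grad_w[0], num_grad_w0, atol=1e-4))
```

Because the fusion is bilinear in $(\mathbf{w}_i, \mathbf{x}_i)$, both gradients reduce to a single matrix-vector product per input, which is what makes end-to-end training straightforward.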

Paper Structure (13 sections, 13 equations, 3 figures, 1 table)

Introduction
Preliminaries
  Feature Fusion
  Convolutional Neural Network (CNN)
  Kronecker Product
Method
  A Feature Fusion Method Based on Kronecker Product
  The Backpropagation Procedure of KPFF
Experiments
  Datasets
  Environment and Settings
  Results and Analysis
Conclusion

Figures (3)

Figure 1: Add Feature Fusion
Figure 2: Concat Feature Fusion
Figure 3: Kronecker Product Feature Fusion (KPFF)
