TFDMNet: A Novel Network Structure Combines the Time Domain and Frequency Domain Features
Hengyue Pan, Yixin Chen, Zhiliang Tian, Peng Qiao, Linbo Qiao, Dongsheng Li
TL;DR
This work tackles the high computational cost of CNN convolutions by replacing spatial Conv operations with an Element-wise Multiplication Layer (EML) trained in the frequency domain, justified by the Cross-Correlation relationship $\mathcal{F}(R({\bf u},{\bf v})) = \mathcal{F}^*({\bf u}) \cdot \mathcal{F}({\bf v})$. It introduces Weight Fixation to bound parameter growth and extends Batch Normalization and Dropout to frequency-domain features via a two-branch real/imaginary design. To balance memory and compute, the Time-Frequency Domain Mixture Network (TFDMNet) uses shallow time-domain layers and deeper frequency-domain layers, with real and imaginary outputs fused at the end for classification. Experiments on MNIST, CIFAR-10, and ImageNet show reduced operation counts with competitive accuracy, validating the practicality of the proposed time–frequency hybrid approach for efficient vision models.
Abstract
Convolutional neural networks (CNNs) have achieved impressive success in computer vision over the past few decades. The image convolution operation helps CNNs perform well on image-related tasks. However, it also has high computational complexity and is hard to parallelize. This paper proposes a novel Element-wise Multiplication Layer (EML), which can be trained in the frequency domain, to replace convolution layers. Theoretical analyses show that EMLs have lower computational complexity and are easier to parallelize. Moreover, we introduce a Weight Fixation mechanism to alleviate over-fitting, and analyze the behavior of Batch Normalization and Dropout in the frequency domain. To balance computational complexity and memory usage, we propose a new network structure, the Time-Frequency Domain Mixture Network (TFDMNet), which combines the advantages of both convolution layers and EMLs. Experimental results indicate that TFDMNet achieves good performance on the MNIST, CIFAR-10, and ImageNet datasets with fewer operations than the corresponding CNNs.
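The cross-correlation relationship $\mathcal{F}(R({\bf u},{\bf v})) = \mathcal{F}^*({\bf u}) \cdot \mathcal{F}({\bf v})$ that motivates replacing convolution with element-wise multiplication can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes circular (periodic) boundaries and equally sized inputs, and the function names `circular_cross_correlation` and `frequency_domain_cross_correlation` are hypothetical. How the paper's EML handles kernel size and padding is not stated in this summary.

```python
import numpy as np


def circular_cross_correlation(u, v):
    """Direct 2-D circular cross-correlation.

    R[m, n] = sum_{i, j} conj(u[i, j]) * v[i + m, j + n], with indices wrapping.
    Assumption: periodic boundaries and u, v of identical shape.
    """
    height, width = u.shape
    out = np.zeros((height, width), dtype=complex)
    for m in range(height):
        for n in range(width):
            # np.roll with a negative shift gives shifted[i, j] = v[i + m, j + n].
            shifted = np.roll(v, shift=(-m, -n), axis=(0, 1))
            out[m, n] = np.sum(np.conj(u) * shifted)
    return out


def frequency_domain_cross_correlation(u, v):
    """Same quantity computed as an element-wise product of spectra."""
    U = np.fft.fft2(u)
    V = np.fft.fft2(v)
    # Element-wise multiplication replaces the sliding-window sum.
    return np.fft.ifft2(np.conj(U) * V)


rng = np.random.default_rng(0)
u = rng.standard_normal((8, 8))  # stands in for a (zero-padded) kernel
v = rng.standard_normal((8, 8))  # stands in for an input feature map

direct = circular_cross_correlation(u, v)
via_fft = frequency_domain_cross_correlation(u, v)
max_abs_error = float(np.max(np.abs(direct - via_fft)))
print("max abs difference:", max_abs_error)
```

The two results should agree up to floating-point error. In this sketch the spectrum of `u` is computed from a spatial kernel only to verify the identity; an EML trained in the frequency domain, as described above, would instead hold its weights directly as frequency-domain values.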
