Multiscale Tensor Summation Factorization as a New Neural Network Layer (MTS Layer) for Multidimensional Data Processing
Mehmet Yamaç, Muhammad Numan Yousaf, Serkan Kiranyaz, Moncef Gabbouj
TL;DR
The paper introduces Multiscale Tensor Summation (MTS) Factorization as a new backbone neural layer (the MTS Layer) for multidimensional data, enabling large receptive fields with far fewer parameters than dense layers. By performing patch-wise, multi-scale tensor sums built from mode products, the MTS Layer reduces parameter counts and improves optimization efficiency, outperforming dense, convolutional, and some transformer baselines in various tasks. The work also introduces the multi-head gate (MHG) non-linearity and builds MTSNet, which, when integrated with MHG, achieves favorable complexity-performance tradeoffs in image restoration tasks, often surpassing state-of-the-art methods with fewer parameters. The authors provide extensive experiments across classification, compression, and restoration, demonstrate training stability advantages over transformers, and release a PyTorch toolbox for implementing MTS-based networks.
Abstract
Multilayer perceptrons (MLPs), or fully connected artificial neural networks, are known for performing vector-matrix multiplications using learnable weight matrices; however, their practical application in many machine learning tasks, especially in computer vision, can be limited due to the high dimensionality of input-output pairs at each layer. To improve efficiency, convolutional operators have been utilized to facilitate weight sharing and local connections, yet they are constrained by limited receptive fields. In this paper, we introduce Multiscale Tensor Summation (MTS) Factorization, a novel neural network operator that implements tensor summation at multiple scales, where each tensor to be summed is obtained through Tucker-decomposition-like mode products. Unlike other tensor decomposition methods in the literature, MTS is not introduced as a network compression tool; instead, it is proposed as a new backbone neural layer. MTS not only reduces the number of parameters required while enhancing the efficiency of weight optimization compared to traditional dense layers (i.e., unfactorized weight matrices in MLP layers), but it also demonstrates clear advantages over convolutional layers. The proof-of-concept experimental comparison of the proposed MTS networks with MLPs and Convolutional Neural Networks (CNNs) demonstrates their effectiveness across various tasks, such as classification, compression, and signal restoration. Additionally, when integrated with modern non-linear units such as the multi-head gate (MHG), also introduced in this study, the corresponding neural network, MTSNet, demonstrates a more favorable complexity-performance tradeoff compared to state-of-the-art transformers in various computer vision applications. The software implementation of the MTS layer and the corresponding MTS-based networks, MTSNets, is shared at https://github.com/mehmetyamac/MTSNet.
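To make the idea of summing tensors obtained through mode products concrete, the following is a minimal sketch for a single scale and a 2D input. It is not the authors' implementation: the function names, the factor matrices `A_t` and `B_t`, and the number of summed terms are illustrative assumptions, and the patching and multi-scale steps described in the abstract are omitted. Each term applies one matrix along the rows and another along the columns of the input, and the terms are then added together.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def tensor_sum_2d(x, factor_pairs):
    """Return sum over t of A_t @ x @ B_t^T (hypothetical single-scale sketch).

    x            : n1 x n2 input, as a list of rows
    factor_pairs : list of (A_t, B_t), with A_t of shape m1 x n1
                   and B_t of shape m2 x n2
    """
    if not factor_pairs:
        raise ValueError("at least one (A_t, B_t) pair is required")
    out = None
    for a, b in factor_pairs:
        # Mode-1 product with A_t, then mode-2 product with B_t.
        term = matmul(matmul(a, x), transpose(b))
        if out is None:
            out = term
        else:
            out = [[o + t for o, t in zip(row_o, row_t)]
                   for row_o, row_t in zip(out, term)]
    return out


def dense_params(n1, n2, m1, m2):
    """Weights in a dense layer mapping a flattened n1*n2 input to m1*m2 outputs."""
    return (n1 * n2) * (m1 * m2)


def factorized_params(n1, n2, m1, m2, num_terms):
    """Weights in num_terms pairs of factor matrices (A_t: m1 x n1, B_t: m2 x n2)."""
    return num_terms * (m1 * n1 + m2 * n2)
```

Under these assumptions, a 32 x 32 input mapped to a 32 x 32 output needs 1,048,576 weights as a dense layer, whereas four summed terms of factor pairs need 8,192, while every output entry still depends on every input entry.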
