TensorProjection Layer: A Tensor-Based Dimension Reduction Method in Deep Neural Networks

Toshinari Morimoto, Su-Yun Huang

TL;DR

Numerical experiments indicate that the proposed dimension reduction method can outperform traditional downsampling methods, such as pooling layers, in the authors' tasks, suggesting it as a promising alternative for feature summarization.

Abstract

In this paper, we propose a dimension reduction method specifically designed for tensor-structured feature data in deep neural networks. The method is implemented as a hidden layer, called the TensorProjection layer, which transforms input tensors into output tensors with reduced dimensions through mode-wise projections. The projection directions are treated as model parameters of the layer and are optimized during model training. Our method can serve as an alternative to pooling layers for summarizing image data, or to convolutional layers as a technique for reducing the number of channels. We conduct experiments on tasks such as medical image classification and segmentation, integrating the TensorProjection layer into commonly used baseline architectures to evaluate its effectiveness. Numerical experiments indicate that the proposed method can outperform traditional downsampling methods, such as pooling layers, in our tasks, suggesting it as a promising alternative for feature summarization.
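The mode-wise projection described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the tensor sizes, the channels-last layout, and the choice of orthonormal projection matrices are all assumptions made for the example. It also shows that 2x2 average pooling is a special case of a mode-wise linear projection with fixed rather than trained matrices.

```python
import numpy as np


def orthonormal_columns(p, q, rng):
    """Return a (p, q) matrix with orthonormal columns (assumes q <= p)."""
    a = rng.standard_normal((p, q))
    u, _ = np.linalg.qr(a)  # reduced QR: u has shape (p, q)
    return u


def tensor_projection(x, u1, u2, u3):
    """Mode-wise projection of a batch x with shape (n, p1, p2, p3).

    Each u_k has shape (p_k, q_k); the result has shape (n, q1, q2, q3).
    In a trained layer the u_k would be learnable parameters (hypothetical here).
    """
    return np.einsum("nabc,ai,bj,ck->nijk", x, u1, u2, u3)


def avg_pool_matrix(p):
    """Fixed (p, p // 2) matrix that averages adjacent pairs along one mode."""
    m = np.zeros((p, p // 2))
    idx = np.arange(p // 2)
    m[2 * idx, idx] = 0.5
    m[2 * idx + 1, idx] = 0.5
    return m


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8, 16))  # (batch, height, width, channels)

# Projection with randomly initialised directions: 8x8x16 -> 4x4x6.
# The output sizes are arbitrary choices for this sketch.
u1 = orthonormal_columns(8, 4, rng)
u2 = orthonormal_columns(8, 4, rng)
u3 = orthonormal_columns(16, 6, rng)
z = tensor_projection(x, u1, u2, u3)
print(z.shape)  # (4, 4, 4, 6)

# The same operation with fixed matrices reproduces 2x2 average pooling.
pooled = tensor_projection(x, avg_pool_matrix(8), avg_pool_matrix(8), np.eye(16))
reference = x.reshape(4, 4, 2, 4, 2, 16).mean(axis=(2, 4))
print(np.allclose(pooled, reference))  # True
```

Unlike pooling, the third matrix can also shrink the channel mode, which is the role the abstract assigns to the layer as an alternative to channel-reducing convolutions.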

Paper Structure

This paper contains 41 sections, 1 theorem, 50 equations, 24 figures, 6 tables.

Key Result

Proposition 1

Let $\{\mathcal{X}_i\}_{i=1}^n$ be the input data to the TensorProjection layer, and let $\{\mathcal{Z}_i\}_{i=1}^n$ be the output as defined by Equation (eq:forward_propagation). The gradients required for backpropagation are calculated as follows: In Equation (eq:dLdvecWk), the two remaining derivatives, $\frac{\partial \mathrm{vec}(\mathcal{Z}_i)}{\partial \mathrm{vec}(U_k)^\top}$ and ...
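As a rough illustration of the kind of gradient the proposition concerns, the sketch below checks the derivative of a scalar loss with respect to one projection matrix against finite differences. It is not the paper's derivation: it treats $U_1$ as a free parameter, ignores any constraint or reparameterization the paper applies to the projection matrices, and uses a hypothetical linear loss $L = \sum G \odot \mathcal{Z}$ in which `g` stands in for the upstream gradient. All names and shapes are assumptions.

```python
import numpy as np


def forward(x, u1, u2, u3):
    """Mode-wise projection: (n, p1, p2, p3) -> (n, q1, q2, q3)."""
    return np.einsum("nabc,ai,bj,ck->nijk", x, u1, u2, u3)


def grad_u1(x, g, u2, u3):
    """Gradient of L = sum(g * Z) with respect to u1, by the chain rule.

    g has the same shape as Z and plays the role of dL/dZ.
    """
    return np.einsum("nijk,nabc,bj,ck->ai", g, x, u2, u3)


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 4, 3))
u1 = rng.standard_normal((5, 2))
u2 = rng.standard_normal((4, 2))
u3 = rng.standard_normal((3, 2))
g = rng.standard_normal((2, 2, 2, 2))  # stand-in for the upstream gradient

analytic = grad_u1(x, g, u2, u3)

# Central finite differences, one entry of u1 at a time.
eps = 1e-6
numeric = np.zeros_like(u1)
for a in range(u1.shape[0]):
    for i in range(u1.shape[1]):
        d = np.zeros_like(u1)
        d[a, i] = eps
        loss_plus = np.sum(g * forward(x, u1 + d, u2, u3))
        loss_minus = np.sum(g * forward(x, u1 - d, u2, u3))
        numeric[a, i] = (loss_plus - loss_minus) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-5))  # True
```

Because the output is linear in each projection matrix, the gradients for the other modes follow the same pattern with the roles of the matrices exchanged.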

Figures (24)

  • Figure 1: Retinal OCT image classification: validation data loss over training epochs. Each point represents the median loss value from 30 repeated runs. The TPL models converge faster than the baseline model, but after epoch 12, the effects of overfitting become more apparent in the TPL models.
  • Figure 2: Retinal OCT image classification: validation data accuracy over training epochs. Validation accuracy rises more quickly for the TPL models than for the baseline model, but all models eventually reach nearly the same performance level.
  • Figure 3: Retinal OCT image classification: weighted F1 score for validation data over training epochs. The F1 score improves more quickly for the TPL models than for the baseline model, but, as with validation accuracy, all models eventually converge to nearly the same performance level.
  • Figure 4: Chest X-ray image classification: median validation loss per epoch for baseline and TPL models, calculated from repeated runs. The shaded range represents the 25th to 75th percentiles. The validation loss for the TPL models remains consistently lower than that of the baseline model throughout training.
  • Figure 5: Chest X-ray image classification: median validation accuracy per epoch for baseline and TPL models, calculated from repeated runs. The shaded range represents the 25th to 75th percentiles. Across all epochs, the TPL models perform slightly better than the baseline model.
  • ...and 19 more figures

Theorems & Definitions (1)

  • Proposition 1: Gradients related to the TensorProjection layer