A Proper Orthogonal Decomposition approach for parameters reduction of Single Shot Detector networks

Laura Meneghetti, Nicola Demo, Gianluigi Rozza

TL;DR

The paper tackles the challenge of deploying high-accuracy object detectors like SSD300 on resource-constrained devices by introducing a POD-based dimensionality reduction layer. It splits the base network at a chosen cut-off layer $l$, projects the high-dimensional pre-model activations $\mathbf{x}^{(l)}$ onto a low-dimensional POD subspace via $\mathbf{z}^i=\mathbf{\Psi}_r^T\mathbf{x}^{(l,i)}$, and connects this reduced representation to the original predictor, with priors reduced from $8732$ to $5782$. Empirical results show that the reduced network achieves substantial gains in memory and training time (e.g., memory down ~15–22% and training time halved) but incurs notable accuracy loss (e.g., mAP drops from $77.8\%$ to $39\%$ on VOC and from $70.2\%$ to $59\%$ on a cat-dog subset). The work highlights a practical trade-off between compression and detection performance and suggests avenues like hyperreduction and automatic cutoff-layer selection to further improve efficiency in real-world deployments.
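The projection $\mathbf{z}^i=\mathbf{\Psi}_r^T\mathbf{x}^{(l,i)}$ can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `pod_basis` and `project`, the array sizes, and the synthetic snapshot data are all assumptions made for illustration, and the basis is computed with a plain truncated SVD without mean-centering.

```python
import numpy as np


def pod_basis(snapshots: np.ndarray, rank: int) -> np.ndarray:
    """Return the first `rank` POD modes of a snapshot matrix.

    `snapshots` has shape (n_features, n_samples): each column is one
    flattened activation vector x^(l,i) taken at the cut-off layer.
    (Hypothetical helper, assumed for this sketch.)
    """
    # The left singular vectors of the snapshot matrix are the POD modes,
    # ordered by decreasing singular value.
    u, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return u[:, :rank]


def project(psi_r: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Compute the reduced coordinates z = Psi_r^T x."""
    return psi_r.T @ x


# Synthetic stand-in for layer activations: nearly rank-8 data plus small noise.
rng = np.random.default_rng(0)
n_features, n_samples, rank = 512, 40, 8
low_rank = rng.standard_normal((n_features, rank)) @ rng.standard_normal((rank, n_samples))
snapshots = low_rank + 1e-3 * rng.standard_normal((n_features, n_samples))

psi_r = pod_basis(snapshots, rank)   # shape (n_features, rank)
z = project(psi_r, snapshots)        # shape (rank, n_samples)

# Relative error of reconstructing the snapshots from the reduced coordinates.
relative_error = np.linalg.norm(snapshots - psi_r @ z) / np.linalg.norm(snapshots)
print(z.shape, relative_error)
```

In this sketch each 512-dimensional column is replaced by 8 coefficients. The reconstruction error stays small only because the synthetic data is close to rank 8 by construction; real activations may need a larger rank or lose more information.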

Abstract

As a major breakthrough in artificial intelligence and deep learning, Convolutional Neural Networks have achieved impressive success in solving many problems in several fields, including computer vision and image processing. Real-time performance, robustness of algorithms, and fast training processes remain open problems in these contexts. In addition, object recognition and detection are challenging tasks for resource-constrained embedded systems, which are commonly used in the industrial sector. To overcome these issues, we propose a dimensionality reduction framework based on Proper Orthogonal Decomposition, a classical model order reduction technique, in order to reduce the number of hyperparameters of the net. We have applied this framework to the SSD300 architecture using the PASCAL VOC dataset, demonstrating a reduction of the network dimension and a remarkable speedup in the fine-tuning of the network in a transfer learning context.

Paper Structure (7 sections, 4 equations, 2 figures, 2 tables, 1 algorithm)


Figures (2)

  • Figure 1: Graphical representation of the reduction method proposed for an object detector.
  • Figure 2: Comparison of the results obtained using the original SSD300 and its reduced version on two test images.