Application-Specific Component-Aware Structured Pruning of Deep Neural Networks in Control via Soft Coefficient Optimization

Ganesh Sundaram, Jonas Ulmen, Amjad Haider, Daniel Görges

TL;DR

This paper addresses application-specific pruning of neural networks used as controllers, where standard compression methods risk degrading task-critical behaviors. It proposes an application-aware, component-aware structured pruning framework that assigns learnable soft pruning coefficients to groups of parameters and optimizes them to meet a target sparsity $\rho$ within tolerance $\varepsilon$ via grid search or a constrained gradient-based method. The approach relies on a dependency-graph representation to define pruning groups and uses task-specific metrics (e.g., PSNR for MNIST autoencoders, episode reward for TD-MPC) to guide pruning. Experiments on an MNIST autoencoder and a TD-MPC agent show that gradient-based coefficient optimization yields higher task performance at fixed sparsity than traditional magnitude-based pruning, with substantial reductions in search time and improved stability of latent-space representations. The results highlight a practical path to deploying compact, control-oriented DNNs while preserving critical application features.
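For reference, PSNR, the reconstruction metric named above, is conventionally defined as $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The sketch below computes it for flat sequences of pixel values, assuming intensities scaled to $[0, 1]$ (so $\mathrm{MAX} = 1$). The function name `psnr` and the flat-list input layout are illustrative choices, not taken from the paper.

```python
import math


def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences.

    Assumes pixel intensities lie in [0, max_value]; this scaling is an
    assumption of the sketch, not a detail confirmed by the paper.
    """
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the pruned autoencoder's reconstruction is closer to the input, which is why it can serve as the task-specific signal during pruning.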

Abstract

Deep neural networks (DNNs) offer significant flexibility and robust performance. This makes them ideal for building not only system models but also advanced neural network controllers (NNCs). However, their high complexity and computational needs often limit their use. Various model compression strategies have been developed over the past few decades to address these issues. These strategies are effective for general DNNs but do not directly apply to NNCs. NNCs need both size reduction and the retention of key application-specific performance features. In structured pruning, which removes groups of related elements, standard importance metrics often fail to protect these critical characteristics. In this paper, we introduce a novel framework for calculating importance metrics in pruning groups. This framework not only shrinks the model size but also considers various application-specific constraints. To find the best pruning coefficient for each group, we evaluate two approaches. The first approach involves simple exploration through grid search. The second utilizes gradient descent optimization, aiming to balance compression and task performance. We test our method in two use cases: one on an MNIST autoencoder and the other on a Temporal Difference Model Predictive Control (TD-MPC) agent. Results show that the method effectively maintains application-relevant performance while achieving a significant reduction in model size.
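The grid-search variant can be illustrated with a minimal sketch. It assumes that each pruning group $g$ has $n_g$ parameters and a coefficient $c_g \in [0, 1]$ giving the fraction pruned from that group, so that the achieved sparsity is $\sum_g c_g n_g / \sum_g n_g$, and that a candidate is feasible when this value lies within $\varepsilon$ of the target $\rho$. These definitions, the group names and sizes, and the `toy_score` function (a stand-in for a real task metric such as PSNR or episode reward) are all hypothetical and not taken from the paper.

```python
import itertools


def achieved_sparsity(coeffs, group_sizes):
    """Fraction of all parameters removed when group g loses a share coeffs[g]."""
    total = sum(group_sizes.values())
    pruned = sum(coeffs[g] * group_sizes[g] for g in group_sizes)
    return pruned / total


def grid_search(group_sizes, evaluate, rho, eps, grid):
    """Try every coefficient combination on `grid`; keep the best feasible one."""
    groups = list(group_sizes)
    best_coeffs, best_score = None, float("-inf")
    for values in itertools.product(grid, repeat=len(groups)):
        coeffs = dict(zip(groups, values))
        # Skip candidates that miss the target sparsity by more than eps.
        if abs(achieved_sparsity(coeffs, group_sizes) - rho) > eps:
            continue
        score = evaluate(coeffs)  # task metric: higher is better
        if score > best_score:
            best_coeffs, best_score = coeffs, score
    return best_coeffs, best_score


# Hypothetical groups and sensitivities, for illustration only.
group_sizes = {"encoder": 600, "latent": 100, "decoder": 300}
sensitivity = {"encoder": 1.0, "latent": 5.0, "decoder": 2.0}


def toy_score(coeffs):
    """Stand-in task metric: pruning a sensitive group costs more score."""
    return -sum(sensitivity[g] * coeffs[g] for g in coeffs)


grid = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
best, score = grid_search(group_sizes, toy_score, rho=0.2, eps=0.015, grid=grid)
print(best, score)
```

In this toy setup the search concentrates pruning on the large, low-sensitivity encoder group and leaves the latent group untouched. The number of candidates grows exponentially with the number of groups, which is one reason a gradient-based alternative is attractive.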

Paper Structure

This paper contains 21 sections, 1 equation, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Reconstruction quality of the baseline (unpruned) model. The top row displays the original input images from the test set, and the bottom row shows their corresponding reconstructions.
  • Figure 2: Comparison of reconstruction quality for a model pruned to 20% sparsity using random (top) and norm-based (bottom) coefficient selection with PyTorch’s standard pruning library.
  • Figure 3: Reconstruction quality comparison for models pruned to 20% sparsity using (a) grid search and (b) gradient descent optimization.
  • Figure 4: Performance degradation of TD-MPC under PyTorch structured pruning using magnitude-based importance. The red dashed line indicates 20% pruning, at which point performance falls to 604.7, a severe degradation relative to the unpruned baseline.