Vanishing Contributions: A Unified Approach to Smoothly Transition Neural Models into Compressed Form

Lorenzo Nikiforos, Charalampos Antoniadis, Luciano Prono, Fabio Pareschi, Riccardo Rovatti, Gianluca Setti

TL;DR

This work addresses the accuracy degradation often seen when compressing deep networks. It introduces Vanishing Contributions (VCON), a general training strategy that runs the original and compressed paths in parallel and gradually shifts weight from the original to the compressed path using a linearly decaying coefficient: $\bar g^{(i),t}_{\Theta,\tilde{\Theta}}(\cdot) = \beta^t f^{(i)}_{\Theta}(\cdot) + (1-\beta^t) g^{(i)}_{\tilde{\Theta}}(\cdot)$ with $\beta^t = \max(1-\frac{t}{Q},0)$, where $f^{(i)}_{\Theta}$ is the $i$-th original block, $g^{(i)}_{\tilde{\Theta}}$ is its compressed counterpart, and $Q$ is the transition duration. The approach is evaluated across three compression modalities (pruning, binary quantization, and low-rank decomposition) on computer vision and natural language processing benchmarks, yielding typical accuracy gains above 3% and occasional improvements of up to 20%. Results show that VCON consistently outperforms standard post-shot compression baselines and remains robust across granularities and tasks, underscoring its practicality for real-world deployment. Overall, VCON provides a general, lightweight extension to existing compression pipelines that improves stability, preserves accuracy, and can be readily integrated into diverse architectures.
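To make the blending rule concrete, the following is a minimal Python sketch of the coefficient $\beta^t$ and of the blended output $\bar g^{(i),t}$. It is an illustration under stated assumptions, not the authors' implementation: the function names `beta` and `vcon_blend`, the toy output vectors, and the reading of $t$ as a training-progress counter (step or epoch) are all hypothetical choices made here.

```python
def beta(t: int, q: int) -> float:
    """Linearly decaying coefficient: beta^t = max(1 - t/Q, 0)."""
    if q <= 0:
        raise ValueError("q (transition duration Q) must be positive")
    return max(1.0 - t / q, 0.0)


def vcon_blend(f_out, g_out, t: int, q: int):
    """Blend the outputs of the original block (f_out) and of the
    compressed block (g_out), both computed on the same input:
    beta^t * f + (1 - beta^t) * g, applied element-wise."""
    b = beta(t, q)
    return [b * fo + (1.0 - b) * go for fo, go in zip(f_out, g_out)]


# Toy outputs of one block (made-up numbers, for illustration only).
f_out = [2.0, -1.0, 0.5]   # original block output
g_out = [1.0, 0.0, 0.5]    # compressed block output
q = 4                      # assumed transition duration Q

for t in range(q + 2):
    # The blend starts at f_out (t = 0) and equals g_out for every t >= Q.
    print(t, beta(t, q), vcon_blend(f_out, g_out, t, q))
```

At $t = 0$ the blend reproduces the original block exactly, and for every $t \ge Q$ only the compressed block contributes, so the original path can be dropped once the transition is over.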

Abstract

The increasing scale of deep neural networks has led to a growing need for compression techniques such as pruning, quantization, and low-rank decomposition. While these methods are very effective in reducing memory, computation, and energy consumption, they often introduce severe accuracy degradation when applied directly. We introduce Vanishing Contributions (VCON), a general approach for smoothly transitioning neural models into compressed form. Rather than replacing the original network directly with its compressed version, VCON executes the two in parallel during fine-tuning. The contribution of the original (uncompressed) model is progressively reduced, while that of the compressed model is gradually increased. This smooth transition allows the network to adapt over time, improving stability and mitigating accuracy degradation. We evaluate VCON across computer vision and natural language processing benchmarks, in combination with multiple compression strategies. Across all scenarios, VCON leads to consistent improvements: typical gains exceed 3%, while some configurations exhibit accuracy boosts of 20%. VCON thus provides a generalizable method that can be applied to existing compression techniques, with evidence of consistent gains across multiple benchmarks.

Paper Structure

This paper contains 19 sections, 4 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Illustration of VCON: from left to right, the original layer (orange) slowly disappears while the compressed layer (green) contribution is gradually incorporated.
  • Figure 2: Illustration of block-wise VCON: the first two blocks $f_{\bm{\Theta}}^{(1)}$ and $f_{\bm{\Theta}}^{(2)}$ are progressively replaced with their compressed counterparts $g_{\tilde{\bm \Theta}}^{(i)}$, while the final block $f_{\bm{\Theta}}^{(3)}$ remains uncompressed.
  • Figure 3: Visual intuition of the VCON approach: when a model parameter is removed abruptly (a), the working point is projected directly onto the hyperplane defined by the remaining dimensions, which is suboptimal. In contrast, if the parameter is removed gradually (b), the working point shifts slowly toward the hyperplane of the remaining dimensions and the model is continuously updated, allowing a greater chance to reach a better local minimum.
  • Figure 4: Impact of Training Dynamics with VCON: evolution during training for different transition durations $Q$.
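
The block-wise scheme described in the Figure 2 caption can be sketched in a few lines of Python. This is a hedged illustration only: the helper `blockwise_vcon_forward`, the scalar toy blocks, and the convention of marking an uncompressed block with `None` are assumptions introduced here, as is the choice to chain the blocks sequentially and feed both branches of a block the same input.

```python
from typing import Callable, Optional, Sequence

Block = Callable[[float], float]


def blockwise_vcon_forward(
    x: float,
    original_blocks: Sequence[Block],
    compressed_blocks: Sequence[Optional[Block]],
    t: int,
    q: int,
) -> float:
    """Forward pass through a chain of blocks at training progress t.

    A block whose compressed counterpart is None is left uncompressed;
    every other block outputs beta^t * f(x) + (1 - beta^t) * g(x).
    """
    beta_t = max(1.0 - t / q, 0.0)  # linearly decaying coefficient
    for f, g in zip(original_blocks, compressed_blocks):
        if g is None:
            x = f(x)  # uncompressed block, as for f^(3) in Figure 2
        else:
            x = beta_t * f(x) + (1.0 - beta_t) * g(x)
    return x


# Hypothetical scalar blocks standing in for f^(1), f^(2), f^(3).
originals = [lambda x: 2.0 * x, lambda x: x + 1.0, lambda x: x * x]
# Hypothetical compressed counterparts for the first two blocks only.
compressed = [lambda x: 1.5 * x, lambda x: x + 0.5, None]

q = 2  # assumed transition duration Q
for t in range(q + 1):
    print(t, blockwise_vcon_forward(1.0, originals, compressed, t, q))
```

With these toy blocks, the output moves from the fully original model at $t = 0$ to the model with its first two blocks compressed at $t = Q$, while the third block is evaluated unchanged throughout.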