Synaptic Pruning: A Biological Inspiration for Deep Learning Regularization
Gideon Vos, Liza van Eijk, Zoltan Sarnyai, Mostafa Rahimi Azghadi
TL;DR
This paper addresses a limitation of conventional dropout, which deactivates neurons at random regardless of their importance, by proposing a biology-inspired, magnitude-based synaptic pruning method that progressively eliminates low-importance connections during training. The approach applies permanent pruning masks inside the training loop, using a cubic sparsity schedule and global weight ranking, and is evaluated on RNN, LSTM, and PatchTST architectures for time-series forecasting. Across four datasets, the method yields consistent reductions in Mean Absolute Error (MAE), with gains of up to 20% in financial forecasting and up to 52% in select transformer models, at modest overhead, which supports its practicality as a regularization and compression technique. The work highlights the potential of biologically inspired pruning to improve generalization and efficiency, particularly in financial time-series applications, and points to future work on scalability and validation across a broader range of architectures.
Abstract
Synaptic pruning in biological brains removes weak connections to improve efficiency. In contrast, dropout regularization in artificial neural networks deactivates neurons at random, without regard to their activity or importance. We propose a magnitude-based synaptic pruning method that better reflects biology by progressively removing low-importance connections during training. Integrated directly into the training loop as a dropout replacement, our approach computes weight importance from absolute magnitudes across layers and applies a cubic schedule to gradually increase global sparsity. At fixed intervals, pruning masks permanently remove low-importance weights while maintaining gradient flow for active ones, eliminating the need for separate pruning and fine-tuning phases. Experiments on multiple time series forecasting models, including RNN, LSTM, and Patch Time Series Transformer (PatchTST), across four datasets show consistent gains. Our method ranked best overall, with statistically significant improvements confirmed by Friedman tests (p < 0.01). In financial forecasting, it reduced Mean Absolute Error (MAE) by up to 20% over models with no or standard dropout, and by up to 52% in select transformer models. This dynamic pruning mechanism advances regularization by coupling weight elimination with progressive sparsification, offering easy integration into diverse architectures. Its strong performance, especially in financial time series forecasting, highlights its potential as a practical alternative to conventional dropout techniques.
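
The mechanism described in the abstract (global magnitude ranking, a cubic sparsity schedule, and permanent masks applied at fixed intervals) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the exact schedule formula, so the sketch assumes the common polynomial form s(t) = s_f + (s_i - s_f)(1 - t/T)^3. The function names `cubic_sparsity` and `update_masks`, the layer names, the random stand-in gradients, and the values of `total_steps`, `prune_every`, and `final_sparsity` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def cubic_sparsity(step, total_steps, final_sparsity, initial_sparsity=0.0):
    """Assumed cubic schedule: sparsity rises quickly at first, then levels off."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3

def update_masks(weights, masks, sparsity):
    """Prune the globally smallest-magnitude weights; pruned entries never return."""
    names = list(weights)
    # Importance is the absolute magnitude. Already-pruned entries get -1 so
    # they always rank lowest and stay inside the pruned set.
    scores = np.concatenate([
        np.where(masks[n], np.abs(weights[n]), -1.0).ravel() for n in names
    ])
    n_prune = int(sparsity * scores.size)
    keep = np.ones(scores.size, dtype=bool)
    keep[np.argsort(scores, kind="stable")[:n_prune]] = False
    new_masks, offset = {}, 0
    for n in names:
        size = weights[n].size
        # AND with the old mask so that pruning is permanent.
        new_masks[n] = masks[n] & keep[offset:offset + size].reshape(weights[n].shape)
        offset += size
    return new_masks

# Toy training loop: two hypothetical layers, random stand-in gradients.
rng = np.random.default_rng(0)
weights = {"layer1": rng.normal(size=(4, 8)), "layer2": rng.normal(size=(8, 2))}
masks = {n: np.ones_like(w, dtype=bool) for n, w in weights.items()}
total_steps, prune_every, final_sparsity = 100, 10, 0.5

for step in range(1, total_steps + 1):
    for n in weights:
        grad = rng.normal(size=weights[n].shape)  # placeholder for a real gradient
        weights[n] -= 0.01 * grad * masks[n]      # only active weights are updated
    if step % prune_every == 0:
        target = cubic_sparsity(step, total_steps, final_sparsity)
        masks = update_masks(weights, masks, target)
        for n in weights:
            weights[n] *= masks[n]                # zero out newly pruned weights

active = sum(int(m.sum()) for m in masks.values())
total = sum(m.size for m in masks.values())
print(f"active weights: {active}/{total}")
```

Because the ranking is global rather than per layer, the two layers can end up with different sparsity levels. Masking the update as well as the weights keeps pruned connections at zero, so no separate fine-tuning phase is needed in this sketch.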
