Dynamic DropConnect: Enhancing Neural Network Robustness through Adaptive Edge Dropping Strategies
Yuan-Chih Yang, Hung-Hsuan Chen
TL;DR
Dynamic DropConnect (DDC) tackles overfitting by replacing DropConnect's single fixed drop rate with per-edge drop probabilities derived from gradient magnitudes. The method normalizes gradient magnitudes layer by layer to obtain a candidate drop probability $q_{i,j}^{(l)}$ for each edge, combines it with a base rate $p$ and a gradient rate $p_g$ to produce the final drop probability $p_{i,j}^{(l)}$, samples the mask from these probabilities, and calibrates the training-time outputs so that inference can use the original weights unchanged. On synthetic data and several open datasets (MNIST, CIFAR-10/100, NORB) with multiple architectures (SimpleCNN, AlexNet, VGG), DDC consistently outperforms Dropout, DropConnect, and Standout, with higher accuracy and lower variance. Because the drop probabilities come from gradients rather than from a trained component, the approach adds no learnable parameters; it provides a robust, scalable regularization mechanism and suggests theoretical avenues linking gradient-driven dropping to Bayesian perspectives. Public code is provided to enable replication and further exploration of gradient-based edge dropping.
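To make the pipeline above concrete, the following is a minimal NumPy sketch of one way such a gradient-informed mask could be built. The summary does not give the exact formulas, so several choices here are assumptions rather than the authors' definitions: min-max normalization as the layer-wise normalization, $q = 1 - \text{normalized } |g|$ as the candidate drop probability (larger gradients are dropped less often), a convex mix $(1 - p_g)\,p + p_g\,q$ as the combination rule, and division by the keep probability as the calibration step. The function names `dynamic_drop_probs` and `masked_forward` are hypothetical.

```python
import numpy as np

def dynamic_drop_probs(grad, p=0.5, p_g=0.5, eps=1e-12):
    """Per-edge drop probabilities from one layer's weight gradients.

    Illustrative only: the normalization, the candidate q, and the
    mixing rule are assumptions, not the paper's definitions.
    """
    mag = np.abs(grad)
    # Assumed layer-wise normalization: min-max scaling to [0, 1].
    norm = (mag - mag.min()) / (mag.max() - mag.min() + eps)
    # Assumed candidate: larger gradient magnitude -> lower drop probability.
    q = 1.0 - norm
    # Assumed combination: convex mix of base rate p and candidate q.
    return (1.0 - p_g) * p + p_g * q

def masked_forward(x, W, drop_probs, rng):
    """Training-time forward pass with a sampled per-edge mask."""
    keep = 1.0 - drop_probs
    mask = rng.random(W.shape) < keep  # True where the edge is kept
    # Assumed calibration: rescale kept weights by 1 / keep so that the
    # expected masked weight equals W wherever keep > 0.
    W_train = np.where(mask, W / np.maximum(keep, eps_keep), 0.0)
    return x @ W_train

eps_keep = 1e-12  # guards the division when an edge has keep probability 0

rng = np.random.default_rng(0)
grad = np.array([[0.0, 1.0], [2.0, 4.0]])
W = np.array([[1.0, -2.0], [0.5, 3.0]])
x = np.array([[1.0, 1.0]])

probs = dynamic_drop_probs(grad, p=0.5, p_g=0.5)
y_train = masked_forward(x, W, probs, rng)
y_infer = x @ W  # inference uses the original weights unchanged
```

Under these assumptions, setting `p_g = 0` recovers a plain DropConnect-style mask with uniform rate `p`, while larger `p_g` lets the gradient signal dominate the per-edge probabilities.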
Abstract
Dropout and DropConnect are well-known techniques that apply a single, fixed drop rate to randomly deactivate neurons or edges in a neural network layer during training. This paper introduces a method that assigns a dynamic drop rate to each edge within a layer, tailoring the dropping process to individual edges without introducing additional learnable parameters. We perform experiments on synthetic and openly available datasets to validate the effectiveness of our approach. The results demonstrate that our method outperforms Dropout, DropConnect, and Standout, a classic adaptive-dropout mechanism. Furthermore, our approach improves the robustness and generalization of neural network training without increasing computational complexity. The complete implementation of our methodology is publicly accessible for research and replication purposes at https://github.com/ericabd888/Adjusting-the-drop-probability-in-DropConnect-based-on-the-magnitude-of-the-gradient/.
