ZNorm: Z-Score Gradient Normalization Accelerating Skip-Connected Network Training without Architectural Modification
Juyoung Yun
TL;DR
ZNorm introduces a gradient-centric normalization that standardizes per-layer gradient statistics to stabilize training in deep skip-connected networks without changing the architecture. By applying Z-score normalization to gradients, it maintains consistent gradient flow and can be integrated with Adam with minimal modification. Empirical results show improved accuracy on CIFAR-10 and PatchCamelyon and improved segmentation metrics on LGG MRI data, indicating robustness across classification and medical imaging tasks. Its simplicity, optimizer-agnostic design, and strong performance in residual and U-Net-based architectures suggest substantial practical value for efficient training of deep networks.
Abstract
Rapid advances in deep learning call for better methods of training deep neural networks (DNNs). As models grow in complexity, vanishing and exploding gradients impede performance, particularly in skip-connected architectures such as Deep Residual Networks. We propose Z-Score Normalization for Gradient Descent (ZNorm), a technique that adjusts only the gradients, leaving the network architecture unchanged, to accelerate training and improve model performance. ZNorm applies Z-score normalization to the gradients so that their scale is consistent across layers, which reduces the risk of vanishing and exploding gradients. Extensive experiments on CIFAR-10 and medical datasets show that ZNorm consistently outperforms existing methods under the same experimental settings. In medical imaging applications, ZNorm significantly improves tumor prediction and segmentation accuracy, underscoring its practical utility. These findings highlight ZNorm's potential as a robust and versatile tool for training deep neural networks, especially skip-connected architectures, across a range of applications.
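The core idea can be illustrated with a minimal sketch, not the authors' implementation. It assumes that each layer's gradient tensor is standardized with its own mean and standard deviation, and that a small constant `eps` guards against division by zero. The function names `znorm_gradient` and `znorm_all`, and the value of `eps`, are hypothetical choices made for this illustration.

```python
import numpy as np


def znorm_gradient(grad, eps=1e-8):
    """Standardize one layer's gradient to roughly zero mean and unit std.

    Assumption: statistics are taken over all elements of the tensor,
    and eps is a hypothetical stabilizer, not a value from the paper.
    """
    mean = grad.mean()
    std = grad.std()
    return (grad - mean) / (std + eps)


def znorm_all(grads, eps=1e-8):
    """Apply the Z-score normalization independently to each layer's gradient."""
    return [znorm_gradient(g, eps) for g in grads]


# Two illustrative "layers" whose raw gradients differ in scale by about 10^6.
rng = np.random.default_rng(0)
raw_grads = [
    rng.normal(size=(4, 3)) * 1e-3,  # a nearly vanishing gradient
    rng.normal(size=(5,)) * 1e3,     # a nearly exploding gradient
]
normalized = znorm_all(raw_grads)
```

After this step both tensors have comparable scale, so a downstream optimizer such as Adam would receive gradients of similar magnitude from every layer. In this sketch a gradient with zero variance maps to all zeros rather than producing a division error.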
