Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning
Qianli Liao, Kenji Kawaguchi, Tomaso Poggio
TL;DR
This work addresses the limitations of Batch Normalization (BN) for online and recurrent learning by introducing Streaming Normalization, a unifying framework that collects normalization statistics online and uses streaming gradients to update parameters without requiring full backpropagation over history. It generalizes Layer Normalization (LN) and BN through Sample Normalization, General Batch Normalization, and Streaming Normalization, and improves suitability for online learning with DAU and Lp normalization (notably L1). The approach is extended to recurrent networks via RGBN and RSN, and is shown to converge faster and perform robustly on CIFAR-10 and character-level language modeling, with implications for hardware efficiency and biological plausibility. Overall, Streaming Normalization offers a flexible normalization scheme that compares favorably with traditional methods in online, small-batch, and recurrent settings, while aligning more closely with biological processing principles.
Abstract
We systematically explored a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously addresses two major limitations of BN: its poor fit for (1) online learning and (2) recurrent learning. Our proposal is simpler and more biologically plausible. Unlike previous approaches, our technique can be applied out of the box to all learning scenarios (e.g., online learning, batch learning, fully-connected, convolutional, feedforward, recurrent, and mixed recurrent-convolutional) and compares favorably with existing approaches. We also propose Lp Normalization for normalizing by different orders of statistical moments. In particular, L1 normalization performs well, is simple to implement, fast to compute, and more biologically plausible, and is thus ideal for GPU or hardware implementations.
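To make the idea of normalizing by different orders of statistical moments concrete, here is a minimal sketch in plain Python. It assumes that Lp normalization centers the activations and divides by the p-th root of the mean p-th absolute deviation; the abstract does not spell out the formula, so this definition, the function name `lp_normalize`, and the `eps` stabilizer are illustrative assumptions rather than the authors' implementation.

```python
def lp_normalize(values, p=1.0, eps=1e-5):
    """Center `values`, then divide by the p-th root of the mean
    p-th absolute deviation (assumed definition, for illustration)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    # Mean of |x - mean|^p: the p-th absolute central moment.
    moment = sum(abs(c) ** p for c in centered) / n
    scale = moment ** (1.0 / p)
    # eps guards against division by zero for constant inputs.
    return [c / (scale + eps) for c in centered]


activations = [1.0, 2.0, 3.0, 4.0]
l1 = lp_normalize(activations, p=1)  # scale is the mean absolute deviation
l2 = lp_normalize(activations, p=2)  # scale is the standard deviation
```

With p = 2 the scale is the usual standard deviation, as in BN-style normalization. With p = 1 it reduces to the mean absolute deviation, so a dedicated implementation needs neither squaring nor a square root, which is consistent with the claim that L1 normalization is cheap to compute.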
