Stochastic Subsampling With Average Pooling
Bum Jun Kim, Sang Woo Kim
TL;DR
The paper tackles regularization in deep networks by addressing the output inconsistency that Dropout introduces, notably when it is used with batch normalization. It proposes stochastic average pooling (SAP), a module that combines stochastic subsampling with average pooling and applies a $\sqrt{p}$ scaling so that training-time subnetworks align with the test-time ensemble, while keeping the output size fixed. SAP has 1D and 2D formulations and can serve as a drop-in replacement for existing average pooling layers. Experiments on image classification, semantic segmentation, and object detection show consistent improvements with SAP, particularly at moderate keep probabilities (e.g., $p\approx0.5$). The work also analyzes subsampling patterns and finds that channel-shared randomness without strong spatial constraints gives the most robust regularization benefit.
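To make the mechanism above concrete, here is a minimal 1D sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `sap_1d`, the choice to keep `round(p * r)` positions per window, and the reuse of one random index pattern across all windows are hypothetical. Only the overall recipe (subsample, average, scale by $\sqrt{p}$ during training; plain average pooling at test time) comes from the summary.

```python
import math
import random


def sap_1d(x, r, p, training, rng=None):
    """Hypothetical 1D stochastic average pooling sketch.

    x: list of floats whose length is a multiple of r
    r: pooling window size (also used as the stride)
    p: keep probability; round(p * r) positions are kept per window
    """
    if len(x) % r != 0:
        raise ValueError("len(x) must be a multiple of r")
    windows = [x[i:i + r] for i in range(0, len(x), r)]

    if not training:
        # Test time: plain average pooling over the full window.
        return [sum(w) / r for w in windows]

    rng = rng or random.Random()
    k = max(1, round(p * r))  # number of kept positions per window
    # Assumption: one index pattern is drawn and reused for every window.
    keep = rng.sample(range(r), k)
    scale = math.sqrt(p)
    # Average only the kept positions, then apply the sqrt(p) scaling.
    return [scale * sum(w[j] for j in keep) / k for w in windows]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(sap_1d(x, r=4, p=0.5, training=False))
    print(sap_1d(x, r=4, p=0.5, training=True, rng=random.Random(0)))
```

In this sketch the output has `len(x) // r` elements in both modes, which mirrors the fixed output size mentioned above. A 2D version would draw the kept positions inside each spatial window in the same way.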
Abstract
Regularization of deep neural networks is important for achieving higher generalization performance without overfitting. Although the popular Dropout method provides a regularization effect, it causes inconsistent properties in the output, which may degrade the performance of deep neural networks. In this study, we propose a new module called stochastic average pooling, which incorporates Dropout-like stochasticity into pooling. We describe the properties of stochastic subsampling and average pooling and leverage them to design a module free of this inconsistency problem. Stochastic average pooling achieves a regularization effect without the potential performance degradation caused by the inconsistency issue and can easily be plugged into existing deep neural network architectures. Experiments demonstrate that replacing existing average pooling with stochastic average pooling yields consistent improvements across a variety of tasks, datasets, and models.
