Batch Normalization-Free Fully Integer Quantized Neural Networks via Progressive Tandem Learning
Pengfei Sun, Wenyu Jiang, Piew Yoong Chee, Paul Devos, Dick Botteldooren
TL;DR
The paper addresses the deployment of quantized neural networks on edge hardware by removing Batch Normalization (BN) from fully integer networks. It introduces a Fully-Integer Quantized Neural Network (FIQNN) trained via progressive tandem learning, which uses layer-wise distillation and per-layer scale factors to reproduce BN's stabilizing effect without any BN operations. Empirical results on ImageNet (AlexNet) and CIFAR-10 show that the BN-free student achieves competitive Top-1/Top-5 accuracy under aggressive quantization, with only modest degradation compared to BN-enabled teachers. The approach is compatible with existing quantization workflows and enables end-to-end integer inference on resource-constrained devices.
Abstract
Quantised neural networks (QNNs) shrink models and reduce inference energy through low-bit arithmetic, yet most still depend on batch normalisation (BN) layers with running statistics, which prevents true integer-only deployment. Prior attempts remove BN by parameter folding or tailored initialisation; while helpful, they rarely recover BN's stability and accuracy and often impose bespoke constraints. We present a BN-free, fully integer QNN trained via a progressive, layer-wise distillation scheme that slots into existing low-bit pipelines. Starting from a pretrained BN-enabled teacher, we use layer-wise targets and progressive compensation to train a student that performs inference exclusively with integer arithmetic and contains no BN operations. On ImageNet with AlexNet, the BN-free model attains competitive Top-1 accuracy under aggressive quantisation. The procedure integrates directly with standard quantisation workflows, enabling end-to-end integer-only inference for resource-constrained settings such as edge and embedded devices.
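
For context, the parameter-folding baseline mentioned among the prior attempts can be sketched as follows. This is a minimal floating-point illustration of the standard folding identity for an inference-time BN layer that follows a fully connected layer; it is not the progressive tandem learning procedure proposed here. All function names, parameter names, and numeric values are hypothetical and chosen only for illustration.

```python
import math

def linear(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BN using fixed running statistics per output channel."""
    return [g * (yi - mu) / math.sqrt(v + eps) + bt
            for yi, g, bt, mu, v in zip(y, gamma, beta, mean, var)]

def fold_bn_into_linear(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Absorb BN's per-channel affine transform into the layer's weights and bias."""
    folded_w, folded_b = [], []
    for row, b, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # per-channel scale factor
        folded_w.append([scale * w for w in row])
        folded_b.append(scale * (b - mu) + bt)
    return folded_w, folded_b

# Illustrative values only (assumed, not taken from the paper).
weights = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.3]
gamma, beta = [1.5, 0.8], [0.2, -0.1]
mean, var = [0.4, 1.0], [0.25, 4.0]
x = [1.0, 2.0]

# Layer followed by BN, versus a single folded layer with no BN operation.
reference = batch_norm(linear(x, weights, bias), gamma, beta, mean, var)
folded_w, folded_b = fold_bn_into_linear(weights, bias, gamma, beta, mean, var)
folded = linear(x, folded_w, folded_b)
```

In exact arithmetic `reference` and `folded` are identical. The folded weights are still floating-point values and must be quantised afterwards, so this identity alone does not address the stability and accuracy concerns raised above.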
