Expanding-and-Shrinking Binary Neural Networks
Xulong Shi, Caiyi Sun, Zhi Qi, Liu Hao, Xiaodong Yang
TL;DR
Binary neural networks (BNNs) offer substantial speed and memory benefits but lag behind real-valued networks in accuracy because each layer has limited representation capacity. The authors address this with the Expanding-and-Shrinking (ES) operation combined with Binary Group Convolution (G), which raises capacity at negligible additional compute. The method extends to Transformers by augmenting the feed-forward network (FFN), with minimal change to attention cost. Key contributions include formalizing representation capacity, detailing the ES and G components, applying them to CNNs and Transformers, and showing consistent gains across image classification, object detection, and diffusion-based super-resolution with negligible overhead. The approach enables more accurate, deployment-friendly BNNs across diverse tasks, and the code is publicly released.
Abstract
While binary neural networks (BNNs) offer significant benefits in speed, memory, and energy, they suffer substantial accuracy degradation on challenging tasks compared to their real-valued counterparts. Because both weights and activations are binarized, the possible values of each entry in the feature maps generated by BNNs are strongly constrained. To tackle this limitation, we propose the expanding-and-shrinking operation, which enhances binary feature maps with a negligible increase in computational complexity, thereby strengthening the representation capacity. Extensive experiments on multiple benchmarks show that our approach generalizes well across diverse applications, ranging from image classification and object detection to generative diffusion models, while also achieving remarkable improvements over various leading binarization algorithms built on different architectures, including both CNNs and Transformers.
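To make the constraint on feature-map values concrete, the following is a minimal sketch, assuming weights and activations both take values in {-1, +1}. It enumerates every value that a length-n binary dot product can produce. The helper names `binary_dot` and `reachable_outputs` are hypothetical and chosen for illustration only; this is not the expanding-and-shrinking operation itself, just the limitation it targets.

```python
from itertools import product


def binary_dot(a, w):
    """Dot product of two equal-length vectors with entries in {-1, +1}."""
    return sum(x * y for x, y in zip(a, w))


def reachable_outputs(n):
    """Return the sorted distinct values of a length-n binary dot product.

    The weights are fixed to all +1. Flipping the sign of a weight only
    permutes the set of activation patterns, so the reachable set is the
    same for any binary weight vector.
    """
    w = (1,) * n
    values = set()
    for a in product((-1, 1), repeat=n):  # all 2**n activation patterns
        values.add(binary_dot(a, w))
    return sorted(values)


if __name__ == "__main__":
    for n in (3, 9):
        outs = reachable_outputs(n)
        print(f"n={n}: {len(outs)} distinct values -> {outs}")
```

Under these assumptions, the 2^n possible activation patterns collapse onto only n + 1 output values (-n, -n + 2, ..., n), whereas a real-valued dot product of the same length is not restricted in this way.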
