94% on CIFAR-10 in 3.29 Seconds on a Single GPU
Keller Jordan
TL;DR
The paper tackles the problem of accelerating CIFAR-10 training on fixed hardware by introducing airbench, a suite of methods that combines patch-whitening and partial-identity initializations, Lookahead optimization, scaled BN biases, alternating flip augmentation, and multi-crop evaluation, with optional Torch compilation. On a single NVIDIA A100, it reaches 94% accuracy in 3.29 seconds, 95% in 10.4 seconds, and 96% in 46.3 seconds, and the code is released for reproducibility. Key contributions include a derandomized alternating flip that reduces redundancy, speedups from individual features that combine additively, and experiments showing that the results generalize to CIFAR-100 and ImageNet settings under certain crop strategies. The practical impact is greatest for rapid hyperparameter studies and large-scale training studies, where faster runs make statistical significance cheaper to assess and reduce computational cost.
Abstract
CIFAR-10 is among the most widely used datasets in machine learning, facilitating thousands of research projects per year. To accelerate research and reduce the cost of experiments, we introduce training methods for CIFAR-10 which reach 94% accuracy in 3.29 seconds, 95% in 10.4 seconds, and 96% in 46.3 seconds, when run on a single NVIDIA A100 GPU. As one factor contributing to these training speeds, we propose a derandomized variant of horizontal flipping augmentation, which we show improves over the standard method in every case where flipping is beneficial over no flipping at all. Our code is released at https://github.com/KellerJordan/cifar10-airbench.
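To make the derandomized flipping idea concrete, the following is a minimal sketch and not the released airbench code. It assumes the scheme works as follows: the first epoch flips a random half of the images, and every later epoch flips exactly the images that were not flipped in the epoch before. The function names (`alternating_flip_mask`, `hflip`, `augment`), the `seed` parameter, and the list-of-rows image layout are all hypothetical and chosen only for illustration.

```python
import random


def alternating_flip_mask(num_images, epoch, seed=0):
    """Return one boolean per image: True means "flip this image in this epoch".

    Assumption: `epoch` is 0-indexed. Epoch 0 uses a random mask, and each
    later epoch uses the complement of the previous epoch's mask.
    """
    rng = random.Random(seed)  # fixed seed, so the epoch-0 mask is reproducible
    first = [rng.random() < 0.5 for _ in range(num_images)]
    if epoch % 2 == 0:
        return first
    return [not flipped for flipped in first]


def hflip(image):
    """Mirror an image stored as a list of rows (each row a list of pixels)."""
    return [row[::-1] for row in image]


def augment(images, epoch, seed=0):
    """Apply the alternating flip for the given epoch to a list of images."""
    mask = alternating_flip_mask(len(images), epoch, seed)
    return [hflip(img) if flip else img for img, flip in zip(images, mask)]
```

Under these assumptions, any two consecutive epochs show each image once in its original orientation and once mirrored, whereas independent random flipping repeats the same orientation in about half of such pairs.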
