Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules
Daniel Ho, Eric Liang, Ion Stoica, Pieter Abbeel, Xi Chen
TL;DR
The paper tackles efficient discovery of data augmentation policies by learning augmentation schedules rather than fixed policies. It introduces Population Based Augmentation (PBA), which uses Population Based Training to jointly optimize augmentation parameters and training progress, producing a schedule that adapts across epochs. PBA achieves competitive results with AutoAugment on CIFAR-10/100 and SVHN while reducing compute by orders of magnitude, and ablations demonstrate that the schedule, not just the policy, drives gains. The approach is simple to implement within a PBT framework and is released as open source for broader adoption.
Abstract
A key challenge in leveraging data augmentation for neural network training is choosing an effective augmentation policy from a large search space of candidate operations. Properly chosen augmentation policies can lead to significant generalization improvements; however, state-of-the-art approaches such as AutoAugment are computationally infeasible to run for the ordinary user. In this paper, we introduce a new data augmentation algorithm, Population Based Augmentation (PBA), which generates nonstationary augmentation policy schedules instead of a fixed augmentation policy. We show that PBA can match the performance of AutoAugment on CIFAR-10, CIFAR-100, and SVHN, with three orders of magnitude less overall compute. On CIFAR-10 we achieve a mean test error of 1.46%, which is a slight improvement upon the current state-of-the-art. The code for PBA is open source and is available at https://github.com/arcelien/pba.
