Iterated Population Based Training with Task-Agnostic Restarts
Alexander Chebykin, Tanja Alderliesten, Peter A. N. Bosman
TL;DR
This paper addresses efficient hyperparameter optimization (HPO) for deep learning under tight budgets. It introduces Iterated Population Based Training (IPBT), which automatically adjusts the step size between hyperparameter (HP) updates through restarts that reuse partially trained weights in a task-agnostic manner and reinitialize HPs via time-varying Bayesian optimization. The iterated approach allows rapid early gains followed by optimization over longer horizons, without requiring a larger budget, which makes it suitable for black-box problems where compute is limited. Across 8 image-classification and reinforcement-learning tasks, IPBT matches or surpasses prior PBT variants and popular HPO baselines without task-specific tuning. Ablations confirm the contribution of stagnation-based restarts, shrink-perturb weight reuse, and exponential step-size growth. The work presents a robust, out-of-the-box method for discovering useful HP schedules and points to multi-objective optimization and neural architecture search (NAS) as directions for future work.
Abstract
Hyperparameter Optimization (HPO) can lift the burden of tuning hyperparameters (HPs) of neural networks. HPO algorithms from the Population Based Training (PBT) family are efficient thanks to dynamically adjusting HPs every few steps of the weight optimization. Recent results indicate that the number of steps between HP updates is an important meta-HP of all PBT variants that can substantially affect their performance. Yet, no method or intuition is available for efficiently setting its value. We introduce Iterated Population Based Training (IPBT), a novel PBT variant that automatically adjusts this HP via restarts that reuse weight information in a task-agnostic way and leverage time-varying Bayesian optimization to reinitialize HPs. Evaluation on 8 image classification and reinforcement learning tasks shows that, on average, our algorithm matches or outperforms 5 previous PBT variants and other HPO algorithms (random search, ASHA, SMAC3), without requiring a budget increase or any changes to its HPs. The source code is available at https://github.com/AwesomeLemon/IPBT.
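To make the idea concrete, the following is a minimal toy sketch of a PBT loop whose step size (the number of training steps between HP updates) grows after each restart. It is not the authors' implementation; the reference code is in the repository linked above. Everything here is an assumption made for illustration: the quadratic toy objective, the stagnation rule (`patience`), the growth factor, the shrink and noise values, and all function names. In particular, HPs are resampled at random on restart, which stands in for the time-varying Bayesian optimization that IPBT uses.

```python
import random

def shrink_perturb(weights, shrink, noise_std, rng):
    """Reuse weights by shrinking them toward zero and adding Gaussian noise."""
    return [shrink * w + rng.gauss(0.0, noise_std) for w in weights]

def train(member, n_steps):
    """Toy weight optimization: gradient descent on f(w) = sum(w_i^2)."""
    for _ in range(n_steps):
        member["weights"] = [w - member["lr"] * 2.0 * w for w in member["weights"]]
    member["score"] = -sum(w * w for w in member["weights"])

def exploit_and_explore(population, rng):
    """Standard PBT update: the bottom quarter copies a top-quarter member
    and perturbs the copied learning rate."""
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    quarter = max(1, len(ranked) // 4)
    for loser in ranked[-quarter:]:
        winner = rng.choice(ranked[:quarter])
        loser["weights"] = list(winner["weights"])
        # Cap the learning rate so the toy problem cannot diverge.
        loser["lr"] = min(0.4, winner["lr"] * rng.choice((0.8, 1.25)))

def sample_lr(rng):
    """Placeholder for HP reinitialization (random, not Bayesian optimization)."""
    return 10 ** rng.uniform(-3, -1)

def run(total_steps=400, pop_size=8, step=2, growth=2, patience=3, seed=0):
    rng = random.Random(seed)
    population = [
        {"weights": [rng.uniform(-1, 1) for _ in range(4)],
         "lr": sample_lr(rng), "score": float("-inf")}
        for _ in range(pop_size)
    ]
    best, stale, used = float("-inf"), 0, 0
    step_history = [step]
    while used < total_steps:
        n = min(step, total_steps - used)  # never exceed the budget
        for member in population:
            train(member, n)
        used += n
        top = max(population, key=lambda m: m["score"])
        if top["score"] > best + 1e-12:
            best, stale = top["score"], 0
        else:
            stale += 1
        if stale >= patience:
            # Restart: grow the step size, reuse the best weights via
            # shrink-perturb, and reinitialize the HPs.
            step *= growth
            step_history.append(step)
            stale = 0
            seed_weights = list(top["weights"])
            for member in population:
                member["weights"] = shrink_perturb(seed_weights, 0.5, 0.01, rng)
                member["lr"] = sample_lr(rng)
        else:
            exploit_and_explore(population, rng)
    return best, step_history

if __name__ == "__main__":
    best_score, steps = run()
    print("best score:", best_score)
    print("step sizes used:", steps)
```

Starting with a small step size lets the population adapt HPs often while the budget is young; each restart then multiplies the step size by `growth`, so later iterations optimize over longer horizons within the same total budget.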
