BOHB: Robust and Efficient Hyperparameter Optimization at Scale

Stefan Falkner; Aaron Klein; Frank Hutter

TL;DR

BOHB addresses the scalability challenge of hyperparameter optimization for large-scale ML by combining Hyperband's resource-aware scheduling with a KDE-based Bayesian optimization component. It operates over budgets $b \in [b_{\min}, b_{\max}]$ and uses a single multidimensional KDE to guide the configuration search, while still sampling some configurations at random to preserve diversity. Empirically, BOHB delivers strong anytime performance and rapid convergence to near-optimal configurations across diverse domains (SVMs, FFNNs, Bayesian NNs, RL, and CIFAR-10 CNNs), outperforming standalone Bayesian optimization and Hyperband in many settings and scaling well with parallel resources. The method is practical and simple to implement, and open-source code is available; budget adaptation is left as a future enhancement.
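To make the resource-aware scheduling concrete, the following is a minimal sketch of the Hyperband-style building block only: a geometric ladder of budgets between $b_{\min}$ and $b_{\max}$, and one successive-halving pass over it. The reduction factor `eta`, its default of 3, and the function names `budget_ladder` and `successive_halving` are illustrative assumptions, not taken from the summary above. The KDE-based sampling of configurations is not shown; candidates are simply passed in.

```python
import math


def budget_ladder(b_min, b_max, eta=3):
    """Geometrically spaced budgets ending at b_max (eta is an assumed factor)."""
    # Number of times b_max can be divided by eta without dropping below b_min.
    s_max = int(math.floor(math.log(b_max / b_min) / math.log(eta) + 1e-9))
    # Smallest budget first, largest (b_max) last.
    return [b_max / eta ** i for i in range(s_max, -1, -1)]


def successive_halving(configs, evaluate, budgets, eta=3):
    """Evaluate configs on increasing budgets, keeping the best 1/eta each round.

    `evaluate(config, budget)` is a hypothetical callable returning a loss
    (lower is better).
    """
    survivors = list(configs)
    for b in budgets:
        ranked = sorted(survivors, key=lambda c: evaluate(c, b))
        keep = max(1, len(ranked) // eta)  # always keep at least one config
        survivors = ranked[:keep]
    return survivors[0]


if __name__ == "__main__":
    budgets = budget_ladder(1, 81)
    # Toy loss: distance to 4, independent of the budget.
    best = successive_halving(range(9), lambda c, b: abs(c - 4), budgets[:3])
    print(budgets, best)
```

In a BOHB-like setup, the candidates handed to `successive_halving` would come partly from the KDE model and partly from random sampling, rather than from a fixed list as in this toy usage.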

Abstract

Modern deep learning methods are very sensitive to many hyperparameters, and, due to the long training times of state-of-the-art models, vanilla Bayesian hyperparameter optimization is typically computationally infeasible. On the other hand, bandit-based configuration evaluation approaches based on random search lack guidance and do not converge to the best configurations as quickly. Here, we propose to combine the benefits of both Bayesian optimization and bandit-based methods, in order to achieve the best of both worlds: strong anytime performance and fast convergence to optimal configurations. We propose a new practical state-of-the-art hyperparameter optimization method, which consistently outperforms both Bayesian optimization and Hyperband on a wide range of problem types, including high-dimensional toy functions, support vector machines, feed-forward neural networks, Bayesian neural networks, deep reinforcement learning, and convolutional neural networks. Our method is robust and versatile, while at the same time being conceptually simple and easy to implement.
