FlexHB: a More Efficient and Flexible Framework for Hyperparameter Optimization

Yang Zhang, Haiyang Wu, Yuekui Yang

TL;DR

FlexHB advances hyperparameter optimization by combining a fine-grained multi-fidelity strategy with a redesigned Successive Halving (SH) framework and a global ranking of configurations across past evaluations. It introduces Fine-Grained Fidelity to collect richer intermediate data, GloSH to reuse past evaluations and revive promising configurations, and FlexBand to adaptively allocate SH brackets based on ranking correlations. Empirical results across neural networks and ML tasks show substantial speedups over state-of-the-art methods and improved anytime performance. The work highlights practical benefits for AutoML pipelines, enabling faster and more flexible exploration of complex hyperparameter spaces.

Abstract

Given a Hyperparameter Optimization (HPO) problem, how can we design an algorithm that finds optimal configurations efficiently? Bayesian Optimization (BO) and multi-fidelity BO methods employ surrogate models to sample configurations based on the history of evaluations. More recent studies obtain better performance by integrating BO with HyperBand (HB), which accelerates evaluation through an early stopping mechanism. However, these methods ignore the advantage of a suitable evaluation scheme over the default HyperBand, and the capability of BO is still constrained by skewed evaluation results. In this paper, we propose FlexHB, a new method that pushes multi-fidelity BO to the limit and redesigns the framework for early stopping with Successive Halving (SH). A comprehensive study of FlexHB shows that (1) our fine-grained fidelity method considerably enhances the efficiency of searching for optimal configurations, and (2) our FlexBand framework (self-adaptive allocation of SH brackets, and global ranking of configurations in both current and past SH procedures) grants the algorithm more flexibility and improves anytime performance. Our method achieves superior efficiency and outperforms other methods on various HPO tasks. Empirical results demonstrate that FlexHB can achieve up to 6.9X and 11.1X speedups over the state-of-the-art MFES-HB and BOHB, respectively.
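
For readers unfamiliar with the early stopping mechanism the abstract refers to, the following is a minimal sketch of plain Successive Halving, the baseline procedure that HyperBand runs in each bracket. It is not FlexHB, GloSH, or FlexBand. The function names `successive_halving` and `toy_loss`, the parameter names, and the toy objective are all assumptions made for illustration and do not come from the paper.

```python
def toy_loss(config, resource):
    """Hypothetical objective: a quadratic in the config plus a term that
    shrinks as more resource (e.g. epochs) is spent. Illustration only."""
    return (config - 0.1) ** 2 + 1.0 / resource


def successive_halving(configs, evaluate, min_resource=1, max_resource=27, eta=3):
    """Plain Successive Halving (an assumed, simplified form).

    Evaluate every surviving config at the current resource, keep the best
    1/eta of them, multiply the resource by eta, and repeat until the
    maximum resource is reached or one config remains.
    Returns (best_config, history), where history holds
    (config, resource, loss) for every evaluation performed.
    """
    survivors = list(configs)
    resource = min_resource
    history = []
    while True:
        # Score all survivors at the current fidelity level.
        scored = [(evaluate(c, resource), c) for c in survivors]
        history.extend((c, resource, loss) for loss, c in scored)
        scored.sort(key=lambda pair: pair[0])  # lower loss is better
        if resource >= max_resource or len(scored) == 1:
            return scored[0][1], history
        # Early stopping: only the top 1/eta advance to the next rung.
        keep = max(1, len(scored) // eta)
        survivors = [c for _, c in scored[:keep]]
        resource = min(resource * eta, max_resource)


if __name__ == "__main__":
    candidates = [0.01, 0.05, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0, 0.2]
    best, history = successive_halving(
        candidates, toy_loss, min_resource=1, max_resource=9, eta=3
    )
    print(best, len(history))
```

In this sketch, configurations dropped at a low-resource rung are never revisited and their low-fidelity results are discarded after ranking. According to the abstract, FlexHB targets this kind of limitation through global ranking across current and past SH procedures and self-adaptive bracket allocation.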

Paper Structure

This paper contains 29 sections, 7 equations, 13 figures, 9 tables, 6 algorithms.

Figures (13)

  • Figure 1: Measurement distribution over fidelity levels in a bracket. Darker color for higher fidelity, $x$-axis for number of measurements, $y$-axis for resources (different fidelities).
  • Figure 2: A 2D hyperparameter space for tuning, with the $x$-axis showing values of the first hyperparameter and the $y$-axis those of the second. Cooler colors mark points with lower validation error (better configurations), warmer colors mark points with higher validation error. Configurations terminated before reaching the full resource are marked with a black "X" in Figure \ref{fig:Main Glosh}e. Details can be found in Appendix C.3.
  • Figure 3: Results for tuning on four main HPO tasks. The LSTM task uses validation perplexity as its metric.
  • Figure 4: Updating weights in MFES and FlexHB, $x$-axis for iterations and $y$-axis for weight values. Each experiment is repeated 10 times on the MLP task to obtain a stable mean value.
  • Figure 5: Early stopped configurations in a 2D hyperparameter space. Early stopped configurations are marked with a black "X".
  • ...and 8 more figures