Grouped Sequential Optimization Strategy -- the Application of Hyperparameter Importance Assessment in Deep Learning
Ruinan Wang, Ian Nabney, Mohammad Golbabaee
TL;DR
This work tackles the high computational cost of hyperparameter optimization (HPO) in deep learning by using Hyperparameter Importance Assessment (HIA) to guide a Grouped Sequential Optimization Strategy (GSOS). Using Bayesian optimization based on the Tree-structured Parzen Estimator (TPE), GSOS optimizes groups of hyperparameters one after another in order of importance, updating the configuration after each group so that later searches start from the best values found so far. Empirical results on six image-classification datasets show that GSOS reduces the time needed to identify the optimal hyperparameters by $19.69\%$ and the total optimization time by $31.90\%$, with a modest average drop in validation and test accuracy of $2.23\%$ and $0.44\%$, respectively. The findings suggest that incorporating HIA into HPO can provide meaningful efficiency gains for time-constrained deployments and AutoML workflows. Extending GSOS to other architectures and optimization frameworks is left to future work.
Abstract
Hyperparameter optimization (HPO) is a critical component of machine learning pipelines, significantly affecting model robustness, stability, and generalization. However, HPO is often time-consuming and computationally intensive. Traditional HPO methods, such as grid search and random search, are often inefficient. Bayesian optimization, while more efficient, still struggles with high-dimensional search spaces. In this paper, we explore how insights from hyperparameter importance assessment (HIA) can be leveraged to accelerate HPO, reducing both time and computational cost. Prior work quantified hyperparameter importance by evaluating 10 hyperparameters of CNNs on 10 common image classification datasets, assigning each hyperparameter an importance weight based on its influence on model performance. Building on those weights, we implement a novel HPO strategy called 'Sequential Grouping.' Our experiments, validated across six additional image classification datasets, demonstrate that incorporating HIA can significantly accelerate HPO without compromising model performance, reducing optimization time by an average of 31.9\% compared to the conventional simultaneous strategy.
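The grouped sequential idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain random sampling stands in for the TPE sampler so that the example needs only the standard library, and the search space, the importance weights, the group size, and the synthetic objective are all hypothetical values chosen for illustration. The sketch ranks hyperparameters by importance, splits them into groups, and searches one group at a time while all other hyperparameters stay fixed at the best values found so far.

```python
import math
import random

# Hypothetical search space (assumption, not taken from the paper).
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2, 1e-1],
    "batch_size": [16, 32, 64, 128],
    "dropout": [0.0, 0.25, 0.5],
    "weight_decay": [0.0, 1e-5, 1e-4],
}

# Hypothetical importance weights standing in for the output of an HIA study.
IMPORTANCE = {
    "learning_rate": 0.50,
    "batch_size": 0.25,
    "dropout": 0.15,
    "weight_decay": 0.10,
}


def make_groups(importance, group_size):
    """Rank hyperparameters by descending importance and chunk them into groups."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return [ranked[i:i + group_size] for i in range(0, len(ranked), group_size)]


def toy_objective(config):
    """Synthetic stand-in for validation accuracy (higher is better)."""
    score = 1.0
    score -= 0.2 * abs(math.log10(config["learning_rate"]) + 2)  # best at 1e-2
    score -= abs(config["batch_size"] - 64) / 640                # best at 64
    score -= 0.2 * abs(config["dropout"] - 0.25)                 # best at 0.25
    score -= 1000 * config["weight_decay"]                       # best at 0.0
    return score


def grouped_sequential_search(space, groups, objective, defaults,
                              trials_per_group, seed=0):
    """Search one group at a time, most important group first.

    Only the hyperparameters of the current group are resampled; every
    other hyperparameter keeps the best value found so far.
    """
    rng = random.Random(seed)
    best_config = dict(defaults)
    best_score = objective(best_config)
    for group in groups:
        for _ in range(trials_per_group):
            candidate = dict(best_config)
            for name in group:
                candidate[name] = rng.choice(space[name])
            score = objective(candidate)
            if score > best_score:
                best_config, best_score = candidate, score
    return best_config, best_score


DEFAULTS = {"learning_rate": 1e-4, "batch_size": 16,
            "dropout": 0.0, "weight_decay": 1e-4}

if __name__ == "__main__":
    groups = make_groups(IMPORTANCE, group_size=2)
    config, score = grouped_sequential_search(
        SEARCH_SPACE, groups, toy_objective, DEFAULTS, trials_per_group=20)
    print(groups)
    print(config, round(score, 3))
```

Each group is searched in a space much smaller than the joint space, which is where the time savings of a sequential strategy would come from. The trade-off is that interactions between hyperparameters placed in different groups are not explored jointly, so the grouping and its order matter.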
