A Meta-Level Learning Algorithm for Sequential Hyper-Parameter Space Reduction in AutoML

Giorgos Borboudakis, Paulos Charonyktakis, Konstantinos Paraschakis, Ioannis Tsamardinos

TL;DR

This work tackles a CASH-like problem (combined algorithm selection and hyper-parameter optimization) in AutoML by introducing Sequential Hyper-parameter Space Reduction (SHSR), a meta-level learning algorithm that uses past runs to prune discrete hyper-parameter choices. SHSR builds predictive models from dataset meta-features to identify and discard groups of configurations that are not expected to significantly improve performance, recursively reducing the search space while keeping the drop in predictive performance within a user-defined tolerance. In experiments across 659 datasets (284 classification, 375 regression), SHSR reduces execution time by about 30% with a drop of less than 0.1% in predictive performance, and it remains effective even when only part of the past results is available. The approach is interpretable, compatible with other HPO methods, and offers a practical way to speed up AutoML without sacrificing reliability.

Abstract

AutoML platforms have numerous options for the algorithms to try for each step of the analysis, i.e., different possible algorithms for imputation, transformations, feature selection, and modelling. Finding the optimal combination of algorithms and hyper-parameter values is computationally expensive, as the number of combinations to explore leads to an exponential explosion of the space. In this paper, we present the Sequential Hyper-parameter Space Reduction (SHSR) algorithm that reduces the space for an AutoML tool with negligible drop in its predictive performance. SHSR is a meta-level learning algorithm that analyzes past runs of an AutoML tool on several datasets and learns which hyper-parameter values to filter out from consideration on a new dataset to analyze. SHSR is evaluated on 284 classification and 375 regression problems, showing an approximate 30% reduction in execution time with a performance drop of less than 0.1%.
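
The sequential elimination idea described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: in place of the meta-feature models that SHSR learns, it assumes a constant predictor (the mean relative drop over past datasets), and it removes the eligible group with the smallest predicted drop first. The function name, the `groups` mapping, and the `tolerance` parameter are hypothetical, and scores are assumed to be positive with higher being better.

```python
from typing import Dict, List, Set


def greedy_space_reduction(
    perf: List[List[float]],      # perf[d][c]: score of configuration c on past dataset d
    groups: Dict[str, Set[int]],  # hypothetical grouping: group name -> configuration indices
    tolerance: float,             # largest accepted mean relative drop, e.g. 0.001
) -> Set[int]:
    """Return the configuration indices kept after greedy group elimination."""
    kept = set(range(len(perf[0])))
    remaining = dict(groups)
    while remaining:
        best_name, best_drop = None, None
        for name, members in remaining.items():
            candidate = kept - members
            if not candidate:
                continue  # never remove every configuration
            # Relative drop versus the full space, averaged over past datasets.
            # Assumption: this mean stands in for a learned meta-level model.
            drops = []
            for row in perf:
                full_best = max(row)
                cand_best = max(row[c] for c in candidate)
                drops.append((full_best - cand_best) / full_best)
            mean_drop = sum(drops) / len(drops)
            if mean_drop <= tolerance and (best_drop is None or mean_drop < best_drop):
                best_name, best_drop = name, mean_drop
        if best_name is None:
            break  # no remaining group can be removed within the tolerance
        kept -= remaining.pop(best_name)
    return kept


if __name__ == "__main__":
    # Placeholder scores for illustration only; not taken from the paper.
    past_scores = [
        [0.90, 0.80, 0.70],
        [0.85, 0.85, 0.60],
        [0.95, 0.70, 0.94],
    ]
    config_groups = {"a": {0}, "b": {1}, "c": {2}}
    print(sorted(greedy_space_reduction(past_scores, config_groups, tolerance=0.02)))
```

Each removal is evaluated against the configurations still kept, so later decisions account for earlier ones; this is the sense in which the reduction is sequential. A version closer to the description above would replace the mean with a model fitted on dataset meta-features.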

Paper Structure

This paper contains 18 sections, 4 figures, 1 table, and 1 algorithm.

Figures (4)

  • Figure 1: Sample size vs feature size for regression and classification datasets. The x-axis shows the sample size, while the y-axis shows the feature size. For the classification datasets, the color intensity varies depending on the class distribution. Both axes are in $\log_{10}$ scale.
  • Figure 2: Effect of threshold $T$ on predictive performance and execution time. The x-axis shows the threshold, and the y-axis shows the ratio between the predictive performance (execution time) using the configurations returned by SHSR, relative to using all configurations. Error bars show the 95% Gaussian confidence intervals for the mean, resulting from 20 runs of the experiment. We observe that SHSR leads to a significant reduction in execution time, with minimal drop in predictive performance.
  • Figure 3: Effect of running SHSR with partial results on predictive performance and execution time. The x-axis shows the proportion of used results, and the y-axis shows the ratio between the predictive performance (execution time) using the configurations returned by SHSR, relative to using all configurations. Error bars show the 95% Gaussian confidence intervals for the mean, resulting from 20 runs of the experiment. We observe that SHSR is able to perform well even with partial results, and that it performs better the more results are available, as expected.
  • Figure 4: Comparison between SHSR, a KNN-based algorithm, and random elimination. The x-axis (y-axis) shows the ratio of execution time (performance) using the configurations returned by the algorithms, relative to using all configurations. Dotted lines show the 95% Gaussian confidence intervals for the mean, resulting from 20 runs of the experiment. For the KNN-based algorithm, splines were used to smooth both the mean and the CI lines. All three algorithms perform similarly when few configurations are dropped, with SHSR being superior when many configurations are removed.
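
The captions above report ratios relative to using all configurations, with 95% Gaussian confidence intervals for the mean over 20 runs. A minimal sketch of that kind of interval follows, assuming the usual normal approximation (mean ± 1.96 standard errors, with the sample standard deviation); the source does not state the exact formula used, and the function name and input values here are hypothetical placeholders.

```python
import math
import statistics
from typing import Sequence, Tuple


def gaussian_ci_of_mean(values: Sequence[float], z: float = 1.96) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) for a normal-approximation interval of the mean."""
    mean = statistics.mean(values)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - z * sem, mean + z * sem


if __name__ == "__main__":
    # Placeholder per-run ratios (reduced space / full space); not results from the paper.
    time_ratios = [0.70, 0.72, 0.68, 0.70]
    mean, lower, upper = gaussian_ci_of_mean(time_ratios)
    print(f"mean={mean:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

With few runs, a Student's t quantile would give a slightly wider interval than the fixed z = 1.96 assumed here.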