Rethinking of Encoder-based Warm-start Methods in Hyperparameter Optimization

Dawid Płudowski, Antoni Zajko, Anna Kozak, Katarzyna Woźnica

TL;DR

This work tackles the problem of representing heterogeneous tabular datasets for meta-learning, focusing on warm-starting Bayesian Hyperparameter Optimization. It evaluates encoder-based representations, namely Dataset2Vec and a novel liltab-based encoder inspired by few-shot learning, on OpenML, UCI, and metaMIMIC datasets, to determine their utility in transferring hyperparameter configurations. Despite encoders producing meaningful dataset clustering, the study finds no consistent gain over simple baselines (including rank-based warm-start) for HP optimization, suggesting that general-purpose representations may not suffice for all meta-tasks. The findings motivate the development of task-aware representations and more effective heuristics for meta-learning in hyperparameter optimization, with potential impacts on speeding up Bayesian optimization in heterogeneous tabular settings.

Abstract

Effectively representing heterogeneous tabular datasets for meta-learning purposes remains an open problem. Previous approaches rely on predefined meta-features, such as statistical measures or landmarkers. Dataset encoders open new possibilities for extracting meta-features because they require no hand-crafted design. Moreover, they have been shown to generate dataset representations with desired spatial properties. In this research, we evaluate an encoder-based approach to one of the most established meta-tasks: warm-starting Bayesian Hyperparameter Optimization. To broaden our analysis, we introduce a new approach for representation learning on tabular data based on [Tomoharu Iwata and Atsutoshi Kumagai. Meta-learning from Tasks with Heterogeneous Attribute Spaces. In Advances in Neural Information Processing Systems, 2020]. Validation on over 100 datasets from UCI and an independent metaMIMIC set of datasets highlights the nuanced challenges in representation learning. We show that general representations may not suffice for some meta-tasks whose requirements are not explicitly considered during extraction.
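
To make the two warm-start strategies compared here more concrete, the following is a minimal sketch, not the authors' implementation. It assumes dataset embeddings are already available as fixed-length vectors (from any encoder), that similarity is measured with Euclidean distance, and that the rank-based baseline picks configurations by average rank across meta-train datasets. The function names `encoder_warm_start` and `rank_warm_start`, the toy embeddings, and the `alpha` hyperparameter are all hypothetical.

```python
import numpy as np


def encoder_warm_start(target_emb, meta_embs, best_configs, k=3):
    """Return the best-known configurations of the k meta-train datasets
    whose embeddings are closest to the target dataset (Euclidean distance,
    an assumption made for this sketch)."""
    dists = np.linalg.norm(meta_embs - target_emb, axis=1)
    nearest = np.argsort(dists)[:k]
    return [best_configs[i] for i in nearest]


def rank_warm_start(scores, k=3):
    """Return indices of the k configurations with the best average rank.

    scores has shape (n_datasets, n_configs); higher is better.
    This baseline ignores the target dataset entirely.
    """
    # Double argsort turns scores into per-dataset ranks (0 = best).
    ranks = (-scores).argsort(axis=1).argsort(axis=1)
    mean_rank = ranks.mean(axis=0)
    return [int(i) for i in np.argsort(mean_rank)[:k]]


# Toy meta-train data: three datasets, 2-D embeddings (illustrative only).
meta_embs = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
best_configs = [{"alpha": 0.1}, {"alpha": 1.0}, {"alpha": 10.0}]
target_emb = np.array([0.9, 0.1])

# Toy performance matrix: rows are datasets, columns are configurations.
scores = np.array([
    [0.70, 0.80, 0.60],
    [0.65, 0.90, 0.75],
    [0.72, 0.85, 0.71],
])

encoder_init = encoder_warm_start(target_emb, meta_embs, best_configs, k=2)
rank_init = rank_warm_start(scores, k=2)
```

In both cases the returned configurations would be evaluated first, before the Bayesian optimizer starts proposing its own points. The encoder-based variant depends on the target dataset's representation, while the rank-based one does not.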

Paper Structure (21 sections, 3 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: The workflow of the performed experiment. We denote datasets using dark blue color, their representations using light blue, and all other components using grey.
  • Figure 2: T-SNE visualization of three encoded datasets' representations.
  • Figure 3: Results for the elastic net on metaMIMIC. Each fold in the ADTM plots (subfigure img:mimic-adtm) presents results for a specific split of the datasets into training and validation subsets. Here, the rank method results in the lowest average distance to the maximum ROC-AUC score for specific datasets. The position on the critical distance scale (subfigures img:mimic-cd-10 and img:mimic-cd-30) denotes the test statistic in the Friedman test. Methods connected by horizontal lines are statistically indistinguishable.
  • Figure 4: First step of the inference network. Here, the inference network learns about the empirical marginal distributions of the attributes based on the support set.
  • Figure 5: Second step of the inference network. Here, the inference network learns the relationships between the attributes and the target based on the support set.
  • ...and 3 more figures
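
The ADTM metric mentioned in the Figure 3 caption (average distance to the maximum ROC-AUC) can be sketched as follows. This is a hedged illustration rather than the paper's exact formula: it assumes each dataset's distance is scaled by the range between the best and worst known scores on that dataset, and the function name `adtm` and the toy numbers are hypothetical.

```python
import numpy as np


def adtm(best_so_far, score_max, score_min):
    """Average distance to the maximum, per optimization iteration.

    best_so_far: (n_datasets, n_iters) best score found up to each iteration.
    score_max, score_min: per-dataset best and worst known scores, used to
    scale distances to [0, 1] (this normalization is an assumption).
    """
    best_so_far = np.asarray(best_so_far, dtype=float)
    score_max = np.asarray(score_max, dtype=float)
    score_min = np.asarray(score_min, dtype=float)
    span = (score_max - score_min)[:, None]
    dist = (score_max[:, None] - best_so_far) / span
    # Average over datasets; lower values mean the method is closer to the optimum.
    return dist.mean(axis=0)


# Toy example: two datasets, three optimization iterations.
curve = adtm(
    best_so_far=[[0.70, 0.80, 0.90], [0.60, 0.60, 0.80]],
    score_max=[0.90, 0.80],
    score_min=[0.50, 0.40],
)
```

A curve that drops faster indicates that a warm-start method reaches near-optimal configurations in fewer iterations.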