Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Szymon Sieradzki, Jacek Mańdziuk

TL;DR

The paper investigates enhancements to the Adaptive Tree-Structured Parzen Estimator (ATPE) for hyperparameter optimization (HPO): smoothing discontinuities, diversifying sample filtering, enriching surrogate components, and extending blocking to categorical hyperparameters. Some modifications (e.g., ATPE-r, ATPE-f, ATPE-cf, ATPE-c) yield improvements on several benchmarks, but combining modifications can produce adverse interactions. Extended experiments show that ATPE-cf and ATPE-cf-zscore often deliver the strongest median improvements and, in many cases, mean improvements as well, though results are benchmark-dependent. The work suggests that targeted, task-dependent combinations of ATPE modifications can improve search efficiency, and it underscores the need to evaluate interactions and surrogate design carefully in HPO.

Abstract

In this paper, we review hyperparameter optimization methods for machine learning models, with a particular focus on the Adaptive Tree-Structured Parzen Estimator (ATPE) algorithm. We propose several modifications to ATPE and assess their efficacy on a diverse set of standard benchmark functions. Experimental results demonstrate that the proposed modifications significantly improve the effectiveness of ATPE hyperparameter optimization on selected benchmarks, a finding of practical relevance for their application in real-world machine learning and optimization tasks.

Paper Structure

This paper contains 23 sections, 16 equations, 10 figures, 10 tables.

Figures (10)

  • Figure 1: Information flow in ATPE
  • Figure 2: Simulated function - Gaussian product.
  • Figure 3: Simulated function - sine product.
  • Figure 4: The progression of secondaryCutoff values in one optimization using ATPE.
  • Figure 5: Visualization of hyperparameters selected for blocking based on secondaryCutoff values in ATPE.
  • ...and 5 more figures