Exploratory Landscape Analysis for Mixed-Variable Problems

Raphael Patrick Prager; Heike Trautmann

TL;DR

This work extends Exploratory Landscape Analysis (ELA) to mixed-variable problems (MVPs) by encoding categorical and hierarchical decision variables into numerical representations, enabling existing ELA feature sets to characterize MVP landscapes. It evaluates two encoding schemes, one-hot (OH) and target encoding (TE), and applies a preprocessing pipeline to handle hierarchical dependencies, scaling, and categorical variables. An automated algorithm selection (AAS) study on 702 MVPs from YAHPO Gym shows that TE-based features are faster to compute and yield superior selection performance, closing the gap between the single best solver (SBS) and the virtual best solver (VBS) by approximately 57.5%. The results support the discriminative power of MVP-specific landscape features and highlight TE as the preferred encoding for future MVP landscape analysis and AAS research, while outlining avenues for improved sampling and hierarchical handling.
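To make the difference between the two encodings concrete, the following is a minimal sketch of one-hot and target encoding for a single categorical decision variable. It assumes the plain mean-per-category variant of target encoding without smoothing, which may differ from the variant used in the paper; the function names and the sample data are hypothetical.

```python
from collections import defaultdict


def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector (one column per category)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded


def target_encode(values, fitness):
    """Replace each category by the mean fitness observed for that category.

    Assumption: plain per-category mean, no smoothing or cross-fitting.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, y in zip(values, fitness):
        sums[v] += y
        counts[v] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return means, [means[v] for v in values]


# Hypothetical sample: one categorical variable and the observed fitness values.
x_cat = ["a", "b", "a", "c"]
y = [1.0, 4.0, 3.0, 10.0]

oh_categories, oh_columns = one_hot_encode(x_cat)
te_means, te_column = target_encode(x_cat, y)
```

One-hot encoding adds one column per category, so the encoded dimensionality grows with the cardinality of the variable, whereas target encoding always yields a single numeric column. This is one plausible reason why TE-based features are cheaper to compute.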

Abstract

Exploratory landscape analysis and fitness landscape analysis in general have been pivotal in facilitating problem understanding, algorithm design, and endeavors such as automated algorithm selection and configuration. These techniques have largely been limited to search spaces of a single domain. In this work, we provide the means to compute exploratory landscape features for mixed-variable problems where the decision space is a mixture of continuous, binary, integer, and categorical variables. This is achieved by utilizing existing encoding techniques originating from machine learning. We provide a comprehensive juxtaposition of the results based on these different techniques. To further highlight their merit for practical applications, we design and conduct an automated algorithm selection study based on a hyperparameter optimization benchmark suite. We derive a meaningful compartmentalization of these benchmark problems by clustering them on the landscape features used. The identified clusters mimic the behavior the algorithms exhibit; that is, different clusters have different best-performing algorithms. Finally, our trained algorithm selector is able to close the gap between the single best and the virtual best solver by 57.5% over all benchmark problems.
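The "gap closed" figure can be illustrated with a small sketch. It assumes the common convention in algorithm selection: performance is a cost where lower is better, the single best solver (SBS) is the solver with the best mean over all instances, the virtual best solver (VBS) picks the best solver per instance, and the closed fraction is (SBS - selector) / (SBS - VBS). The solver names, scores, and selector choices below are hypothetical and are not taken from the paper.

```python
def sbs_and_vbs(perf):
    """Return (sbs_name, sbs_score, vbs_score) from per-instance costs.

    perf maps a solver name to a list of costs, one per problem instance
    (lower is better); all lists must have the same length.
    """
    n = len(next(iter(perf.values())))
    means = {solver: sum(costs) / n for solver, costs in perf.items()}
    sbs_name = min(means, key=means.get)
    vbs_score = sum(min(costs[i] for costs in perf.values()) for i in range(n)) / n
    return sbs_name, means[sbs_name], vbs_score


def gap_closed(sbs_score, vbs_score, selector_score):
    """Fraction of the SBS-VBS gap closed by a selector (1.0 means VBS level)."""
    if sbs_score == vbs_score:
        raise ValueError("SBS and VBS coincide; there is no gap to close")
    return (sbs_score - selector_score) / (sbs_score - vbs_score)


# Hypothetical costs of two solvers on four problem instances.
perf = {
    "solver_A": [1.0, 4.0, 2.0, 3.0],
    "solver_B": [3.0, 2.0, 6.0, 1.0],
}
# Hypothetical selector decisions, one solver per instance.
choices = ["solver_A", "solver_B", "solver_A", "solver_A"]

sbs_name, sbs_score, vbs_score = sbs_and_vbs(perf)
selector_score = sum(perf[s][i] for i, s in enumerate(choices)) / len(choices)
closed = gap_closed(sbs_score, vbs_score, selector_score)
```

In this toy example the selector recovers half of the gap; the 57.5% reported above is the analogous quantity over all benchmark problems.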

Paper Structure (13 sections, 5 equations, 9 figures, 3 tables)

Introduction
Mixed-Variable Problems
Exploratory Landscape Analysis
Preprocessing Steps for Exploratory Landscape Analysis
Hierarchical Decision Space
Different Ranges in the Search Space
Transformation of Categorical Decision Variables
Experimental Methods
Benchmark Functions
Algorithm Portfolio
Exploratory Landscape Analysis Feature Generation
Automated Algorithm Selection
Conclusion and Outlook

Figures (9)

Figure 1: High-level overview of the different steps prior to ELA feature computation for MVP.
Figure 2: Exemplary hierarchical fitness landscape. The domain of $X_{cat}$ comprises the values a and b. $X_{cont}$ only affects the fitness landscape when $X_{cat} = b$.
Figure 3: Frequency of best performing solver per scenario. The text on the x-axis states the problem dimension, the scenario name, the number of categorical variables and the sum of the cardinality of said categorical variables. The actual occurrences are as follows: SM $448$, RS $125$, EA $97$, OP $32$.
Figure 4: Performance comparison of each individual solver contrasted to the VBS on a log scale. When a solver has an equivalent performance to the VBS for a given problem instance, the points are located on the line diagonally separating each plot. The horizontal distance to that line quantifies how much worse a specific solver is compared to the VBS. The vertical dashed line represents the worst possible ERT, where only a single repetition out of $20$ reaches the target.
Figure 5: Computation time of ELA feature sets grouped by encoding and dimensionality of the problem. The x-axis depicts the total cardinality of the decision space whereas the y-axis shows the required time in seconds to calculate a respective feature set.
...and 4 more figures
