MO-IOHinspector: Anytime Benchmarking of Multi-Objective Algorithms using IOHprofiler

Diederick Vermetten, Jeroen Rook, Oliver L. Preuß, Jacob de Nobel, Carola Doerr, Manuel López-Ibañez, Heike Trautmann, Thomas Bäck

TL;DR

This paper addresses the challenge of evaluating multi-objective optimization algorithms from an anytime perspective by introducing MO-IOHinspector, a module that logs all evaluated solutions using an unbounded external archive to decouple experimental design from analysis. Integrated into the IOHprofiler framework and connected with PyMOO, it enables flexible, indicator-agnostic analysis where metrics such as $HV$, $IGD^+$, and $R2$ can be recomputed post hoc, including lazy computation and varying reference sets. The authors demonstrate the approach on ZDT and DTLZ problems with multiple MOEAs across several population sizes, revealing time-dependent performance and enabling rich visualizations (ECDFs, attainment surfaces) and robust time-aware rankings. The work provides two software components, promotes data sharing and reproducible performance studies, and outlines future directions like finite Pareto-front approximations and broader library integrations.

Abstract

Benchmarking is one of the key ways in which we can gain insight into the strengths and weaknesses of optimization algorithms. In sampling-based optimization, considering the anytime behavior of an algorithm can provide valuable insights for further developments. In the context of multi-objective optimization, this anytime perspective is not as widely adopted as in the single-objective context. In this paper, we propose a new software tool which uses principles from unbounded archiving as a logging structure. This leads to a clearer separation between experimental design and subsequent analysis decisions. We integrate this approach as a new Python module into the IOHprofiler framework and demonstrate the benefits of this approach by showcasing the ability to change indicators, aggregations, and ranking procedures during the analysis pipeline.
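The idea of using an unbounded archive as a logging structure can be illustrated with a minimal sketch. It is not the MO-IOHinspector API: the class `UnboundedArchiveLogger` and the helper functions are hypothetical names, the sketch assumes two minimized objectives, and the logged points are made up. It stores every evaluated objective vector in order, so that the hypervolume at any evaluation budget, and for any reference point, can be computed after the run.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


class UnboundedArchiveLogger:
    """Hypothetical logger: keeps every evaluated objective vector, in order."""

    def __init__(self) -> None:
        self.history: List[Point] = []

    def log(self, objectives: Sequence[float]) -> None:
        # Nothing is discarded here; filtering happens only at analysis time.
        self.history.append((float(objectives[0]), float(objectives[1])))


def nondominated(points: Sequence[Point]) -> List[Point]:
    """Nondominated subset of 2-D points (minimization), sorted by objective 1."""
    front: List[Point] = []
    best_f2 = float("inf")
    for f1, f2 in sorted(set(points)):
        if f2 < best_f2:  # strictly better in objective 2 than all points so far
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: Sequence[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by the reference point `ref`."""
    front = [p for p in nondominated(points) if p[0] < ref[0] and p[1] < ref[1]]
    area, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:
        # Horizontal slab between the previous and the current objective-2 value.
        area += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area


def anytime_hypervolume(
    history: Sequence[Point], ref: Point, budgets: Sequence[int]
) -> List[float]:
    """Hypervolume of all solutions seen within each evaluation budget."""
    return [hypervolume_2d(history[:b], ref) for b in budgets]


logger = UnboundedArchiveLogger()
for objectives in [(2.0, 2.0), (1.0, 3.0), (3.0, 3.0), (3.0, 1.0)]:  # made-up data
    logger.log(objectives)

# The reference point is chosen during analysis, not during the experiment.
hv_curve = anytime_hypervolume(logger.history, ref=(4.0, 4.0), budgets=[1, 2, 3, 4])
print(hv_curve)
```

Because the full history is kept, the same log can be re-analyzed with a different reference point or a different indicator without rerunning the algorithm.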

Paper Structure

This paper contains 14 sections, 7 figures.

Figures (7)

  • Figure 1: Schematic overview of our benchmarking pipeline.
  • Figure 2: Evolution of hypervolume over time for the selected algorithms on selected problems. Within each subplot, algorithms have been set to a common population size/number of reference vectors. From left to right, top to bottom, the function (population size) of each subplot is as follows: ZDT2 (100), ZDT6 (500), DTLZ1 (50), and ZDT5 (100). The reference point is always set to $[1.1]^d$ after normalizing the objectives ($d=2$ for ZDT, $d=3$ for DTLZ). Shaded areas show the 95% confidence intervals, lines show the mean.
  • Figure 3: Evolution of hypervolume (left) and IGD+ (right) over time for the selected parameterizations of the MOEA/D algorithm on ZDT4. The reference set used for IGD+ is taken from PyMOO, while the reference point for the hypervolume is set to $[1.1]^2$ after normalizing the objectives. Shaded areas show the 95% confidence intervals.
  • Figure 4: Example ECDF plot of hypervolume (where the reference point is $[1.1]^d$ and the hypervolume is then scaled to $[0,1]$). This plot aggregates the hypervolume-over-time behavior on all considered benchmark problems for the selected algorithm (NSGA-2) with different population sizes.
  • Figure 5: EAF plots of the full set of solutions evaluated by NSGA-2 with population size 10 (top left) and the final set of solutions returned by the algorithm (top right) on ZDT5. The colors indicate the fraction of runs in which a solution dominating the point was attained. The bottom plot shows the EAF-difference between these two plots, where colors correspond to the fraction of runs where a dominating solution was found in the archive, but not in the final population.
  • ...and 2 more figures
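
An ECDF like the one described for Figure 4 can be built in several ways. The sketch below shows one common construction, assumed here rather than taken from the paper: for each evaluation step, it reports the fraction of (run, target) pairs whose best-so-far scaled hypervolume has reached the target. The function name `ecdf_over_time`, the target values, and the example curves are all hypothetical.

```python
from typing import List, Sequence


def ecdf_over_time(
    curves: Sequence[Sequence[float]], targets: Sequence[float]
) -> List[float]:
    """Fraction of (run, target) pairs reached at each step.

    `curves` holds one hypervolume-over-time sequence per run, scaled to [0, 1].
    A pair counts as reached once the best value seen so far is >= the target.
    """
    n_steps = min(len(curve) for curve in curves)  # compare runs on a common horizon
    total_pairs = len(curves) * len(targets)
    ecdf: List[float] = []
    for step in range(n_steps):
        reached = sum(
            1
            for curve in curves
            for target in targets
            if max(curve[: step + 1]) >= target
        )
        ecdf.append(reached / total_pairs)
    return ecdf


# Made-up scaled hypervolume curves for two runs and three assumed targets.
curves = [[0.2, 0.5, 0.9], [0.1, 0.4, 0.6]]
targets = [0.25, 0.5, 0.75]
ecdf = ecdf_over_time(curves, targets)
print(ecdf)
```

Taking the best-so-far value keeps the ECDF non-decreasing even if a curve is not monotone; with an unbounded archive the hypervolume never decreases, so the maximum simply equals the current value.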