Best Practices For Empirical Meta-Algorithmic Research: Guidelines from the COSEAL Research Network

Theresa Eimer, Lennart Schäpermeier, André Biedenkapp, Alexander Tornede, Lars Kotthoff, Pieter Leyman, Matthias Feurer, Katharina Eggensperger, Kaitlin Maile, Tanja Tornede, Anna Kozak, Ke Xue, Marcel Wever, Mitra Baratchi, Damir Pulatov, Heike Trautmann, Haniye Kashgarani, Marius Lindauer

TL;DR

This paper provides a comprehensive, living set of best-practice guidelines for empirical meta-algorithmic research across the COSEAL community. It covers the entire experimental lifecycle—from formulating research questions (exploratory vs confirmatory) to designing fair evaluations, building reproducible software, and interpreting results with robust visualizations and statistics. The work emphasizes using solid baselines and benchmarks, leveraging surrogate and synthetic benchmarks judiciously, and ensuring reproducibility through standardized data formats, end-to-end pipelines, and open-source software. By detailing concrete examples and common pitfalls, it aims to raise the reliability, efficiency, and societal relevance of meta-algorithmic research while inviting ongoing community refinement.

Abstract

Empirical research on meta-algorithmics, such as algorithm selection, configuration, and scheduling, often relies on extensive and thus computationally expensive experiments. With the large degree of freedom we have over our experimental setup and design comes a plethora of possible error sources that threaten the scalability and validity of our scientific insights. Best practices for meta-algorithmic research exist, but they are scattered between different publications and fields, and continue to evolve separately from each other. In this report, we collect good practices for empirical meta-algorithmic research across the subfields of the COSEAL community, encompassing the entire experimental cycle: from formulating research questions and selecting an experimental design, to executing experiments, and ultimately, analyzing and presenting results impartially. It establishes the current state-of-the-art practices within meta-algorithmic research and serves as a guideline to both new researchers and practitioners in meta-algorithmic fields.

Paper Structure

This paper contains 97 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Schematic comparison of a simple barchart vs. piechart. Barcharts make it easy to see which approach performs best and to quantify performance differences, whereas the piechart is harder to read. Additionally, the piechart requires another aesthetic (here: color fill) to distinguish groups. The barchart leaves that aesthetic free, so it can encode further information or (here) be presented in a more minimalist fashion focusing on the data.
  • Figure 2: jet, turbo and viridis color scales (top-to-bottom) in color (left) and graytone (right). Of the two rainbow color palettes, jet introduces visual artifacts in the form of brightness jumps, while turbo's transition between colors is smooth. In contrast to both, viridis is perceptually linear, i.e., brighter colors always correspond to higher values.
  • Figure 3: Schematic comparison of plotting distributed vs. aggregated data using a boxplot of individual run data and a barchart of aggregated mean values. The boxplot visualization allows us to see that while method A performs better in the median, method B is competitive in some cases as well, while they are both clearly better than methods C and D.
  • Figure 4: Schematic comparison of aggregated performance over time vs. a fixed point in time. The convergence plot shows that, depending on the time allowed, different approaches perform best regarding their average performance. For example, Method B outperforms Method A for time budgets in the range of 100-300, while Method A outperforms B at the end of the full time scale of this (hypothetical) experiment.
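
The points made by Figures 3 and 4 can also be checked numerically before any plot is drawn. The following is a minimal sketch using only the Python standard library. All numbers, method names, and the helper functions `summarize` and `best_at` are hypothetical illustrations chosen to mirror the qualitative pattern in the captions; they are not data or code from the paper.

```python
from statistics import mean, median

# Hypothetical per-run scores (higher is better); NOT data from the paper.
runs = {
    "A": [0.88, 0.90, 0.91, 0.92, 0.89],
    "B": [0.70, 0.75, 0.93, 0.72, 0.94],
}


def summarize(scores):
    """Report several statistics so the spread of runs is not hidden
    behind a single aggregated value (the point of Figure 3)."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Hypothetical mean performance per time budget (higher is better).
curves = {
    "A": {100: 0.50, 200: 0.60, 300: 0.70, 400: 0.90},
    "B": {100: 0.60, 200: 0.70, 300: 0.75, 400: 0.80},
}


def best_at(perf_curves, budget):
    """Return the method with the highest mean performance at one budget
    (the point of Figure 4: the winner can change with the budget)."""
    return max(perf_curves, key=lambda method: perf_curves[method][budget])


if __name__ == "__main__":
    for name, scores in runs.items():
        print(name, summarize(scores))
    for budget in (100, 200, 300, 400):
        print(budget, best_at(curves, budget))
```

With these made-up values, method A has the better median while some individual runs of method B exceed every run of A, and the best method by mean performance switches from B to A only at the final budget. Reporting a single aggregate at a single budget would hide both observations.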