Was Tournament Selection All We Ever Needed? A Critical Reflection on Lexicase Selection

Alina Geiger, Martin Briesch, Dominik Sobania, Franz Rothlauf

TL;DR

The paper investigates whether tournament selection with down-sampling can match or exceed the performance of lexicase-based methods in symbolic regression. It compares epsilon-lexicase selection and tournament selection, each with random and informed down-sampling, on synthetic Friedman benchmarks and real-world datasets, and measures performance, generalization, diversity, and solution size. The results show that down-sampling improves generalization and reduces code growth for both methods, with stronger gains for tournament selection. Tournament selection with down-sampling matches epsilon-lexicase selection in performance while being faster, and down-sampling increases population diversity for tournament selection. The study argues for broader evaluation of down-sampling with non-lexicase selection methods and suggests tournament selection with down-sampling as a practical, scalable alternative in symbolic regression and potentially other domains.

Abstract

The success of lexicase selection has led to various extensions, including its combination with down-sampling, which further increased performance. However, recent work found that down-sampling also leads to significant improvements in the performance of tournament selection. This raises the question of whether tournament selection combined with down-sampling is the better choice, given its faster running times. To address this question, we run a set of experiments comparing epsilon-lexicase and tournament selection with different down-sampling techniques on synthetic problems of varying noise levels and problem sizes as well as real-world symbolic regression problems. Overall, we find that down-sampling improves generalization and performance even when compared over the same number of generations. This means that down-sampling is beneficial even with far fewer fitness evaluations. Additionally, down-sampling successfully reduces code growth. We observe that population diversity increases for tournament selection when combined with down-sampling. Further, we find that tournament selection and epsilon-lexicase selection with down-sampling perform similarly, while tournament selection is significantly faster. We conclude that tournament selection should be further analyzed and improved in future work instead of only focusing on the improvement of lexicase variants.
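
To make the combination discussed in the abstract concrete, the following is a minimal Python sketch of tournament selection with random down-sampling. It is an illustration under stated assumptions, not the authors' implementation: the function names (`random_down_sample`, `tournament_select`, `make_linear`), the mean-squared-error fitness, the down-sampling rate, and the tournament size are all hypothetical choices made for this example. Informed down-sampling and epsilon-lexicase selection are not shown.

```python
import random


def random_down_sample(cases, rate, rng):
    """Draw a random subset of the training cases without replacement.

    `rate` is the fraction of cases kept (an assumed parameter name).
    """
    k = max(1, round(len(cases) * rate))
    return rng.sample(cases, k)


def mean_squared_error(model, cases):
    """Aggregate error of a candidate model on the given (x, y) cases."""
    return sum((model(x) - y) ** 2 for x, y in cases) / len(cases)


def tournament_select(population, cases, tournament_size, rng):
    """Pick `tournament_size` random candidates and return the one
    with the lowest aggregate error on `cases`."""
    contestants = rng.sample(population, tournament_size)
    return min(contestants, key=lambda m: mean_squared_error(m, cases))


def make_linear(slope):
    """Toy stand-in for an evolved expression: f(x) = slope * x."""
    return lambda x: slope * x


rng = random.Random(0)

# Toy regression task: y = 2x on x = 1..20 (x = 0 is excluded so that
# only the correct model has zero error on every possible subset).
training_cases = [(x, 2.0 * x) for x in range(1, 21)]
population = [make_linear(s) for s in (0.0, 1.0, 2.0, 3.0)]

# In each generation, a fresh subset is drawn and every selection event
# in that generation evaluates candidates only on this subset.
subset = random_down_sample(training_cases, rate=0.1, rng=rng)
winner = tournament_select(population, subset, tournament_size=4, rng=rng)
```

In this sketch each candidate is evaluated on 2 of the 20 training cases per generation, which is where the reduction in fitness evaluations comes from. Because tournament selection only needs one aggregate error per candidate, the selection step itself stays cheap regardless of the number of cases.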

Paper Structure

This paper contains 11 sections, 10 figures, and 2 tables.

Figures (10)

  • Figure 1: Test fitness of the best individual for the synthetic problems across different noise levels. Outliers are not shown to improve readability.
  • Figure 2: Generalization gap between test and training fitness of the best individual for the synthetic problems across different noise levels. Outliers are not shown to improve readability.
  • Figure 3: Error diversity of individuals in a population over generations for the synthetic problems across different noise levels. Median over 30 runs is shown.
  • Figure 4: Median size of individuals in a population (measured in tree nodes) over generations for the synthetic problems across different noise levels. Median over 30 runs is shown.
  • Figure 5: Test fitness of the best individual for the synthetic Friedman1 problem with varying numbers of features (5, 10, 25, 50). Outliers are not shown to improve readability.
  • ...and 5 more figures