Lexicase Selection Parameter Analysis: Varying Population Size and Test Case Redundancy with Diagnostic Metrics
Jose Guadalupe Hernandez, Anil Kumar Saini, Jason H. Moore
TL;DR
The paper investigates how "hidden" parameters, notably population size and test-case redundancy, influence lexicase selection under a fixed evaluation budget. It uses the exploitation rate and contradictory objectives diagnostics from the DOSSIER suite, plus a redundancy extension, to quantify exploitation, specialist maintenance, and the impact of duplicate test cases. Key findings show that smaller populations tend to exploit gradients more effectively, while larger populations tend to maintain more specialists; high redundancy can hinder the optimization and maintenance of specialists, even at larger population sizes. The work offers practical guidance for tuning lexicase under a budget constraint and stresses that population size, evaluation budget, and test cases must be chosen to fit the characteristics of the problem being solved.
Abstract
Lexicase selection is a successful parent selection method in genetic programming that has outperformed other methods across multiple benchmark suites. Unlike other selection methods that require explicit parameters to function, such as tournament size in tournament selection, lexicase selection does not. However, if evolutionary parameters like population size and number of generations affect the effectiveness of a selection method, then lexicase's performance may also be impacted by these "hidden" parameters. Here, we study how these hidden parameters affect lexicase's ability to exploit gradients and maintain specialists using diagnostic metrics. By varying the population size with a fixed evaluation budget, we show that smaller populations tend to have greater exploitation capabilities, whereas larger populations tend to maintain more specialists. We also consider the effect redundant test cases have on specialist maintenance, and find that high redundancy may hinder the ability to optimize and maintain specialists, even for larger populations. Ultimately, we highlight that population size, evaluation budget, and test cases must be carefully considered for the characteristics of the problem being solved.
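To make the setting concrete, the following is a minimal sketch of standard lexicase selection together with the population-size and generation tradeoff implied by a fixed evaluation budget. It is an illustration, not the authors' implementation: the function names `lexicase_select` and `generations_for_budget` are hypothetical, scores are assumed to be maximized, and one evaluation is assumed to cost one individual per generation.

```python
import random


def lexicase_select(scores, rng=random):
    """Return the index of one parent chosen by lexicase selection.

    scores[i][j] is the score of individual i on test case j
    (assumption: higher is better).
    """
    candidates = list(range(len(scores)))
    cases = list(range(len(scores[0])))
    rng.shuffle(cases)  # a fresh random test-case ordering per selection event
    for case in cases:
        # Keep only the candidates that are best on the current test case.
        best = max(scores[i][case] for i in candidates)
        candidates = [i for i in candidates if scores[i][case] == best]
        if len(candidates) == 1:
            break
    # Any remaining ties are broken uniformly at random.
    return rng.choice(candidates)


def generations_for_budget(budget, pop_size):
    """Generations affordable under a fixed evaluation budget.

    Assumption: each generation costs pop_size evaluations.
    """
    return budget // pop_size
```

Under these assumptions, doubling the population size halves the number of generations available. Duplicating a test case (a repeated column in `scores`) makes that case more likely to appear early in the shuffled ordering, which shifts selection toward the individuals that do well on it.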
