DALex: Lexicase-like Selection via Diverse Aggregation
Andrew Ni, Li Ding, Lee Spector
TL;DR
DALex is a matrix-based, lexicase-like selection method that aggregates per-case errors with randomly sampled weights, achieving substantial runtime speedups while preserving nearly identical problem-solving performance across domains. By sampling training-case weights from a distribution and applying batched matrix multiplication, DALex recovers lexicase behavior in the limit of infinite particularity pressure and interpolates smoothly to relaxed variants as the standard deviation of the weight distribution is varied. Empirical results across program synthesis, image classification, symbolic regression, and learning classifier systems show that DALex matches or closely approximates the performance of lexicase selection and its relaxed variants with significant reductions in computation time, allowing larger populations or more generations under the same budget. The approach places several selection strategies within a single, scalable framework, with possible implications for evolutionary computation and, potentially, for deep learning and reinforcement learning.
Abstract
Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so significantly more quickly. The new method, called DALex (for Diversely Aggregated Lexicase), selects the best individual with respect to a weighted sum of training case errors, where the weights are randomly sampled. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between the behavior of lexicase selection and its "relaxed" variants, such as epsilon or batch lexicase selection, by adjusting a single hyperparameter, named "particularity pressure," which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems.
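The weighted-sum formulation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say how the weights are sampled, so the sketch assumes, for illustration only, that each selection event draws one normally distributed importance score per training case and converts the scores to weights with a softmax. It also assumes that the standard deviation of those scores plays the role of the particularity pressure. The function name `dalex_select` and its parameters are hypothetical.

```python
import numpy as np


def dalex_select(errors, n_select, particularity_pressure=20.0, rng=None):
    """Pick `n_select` parent indices by randomly weighted error aggregation.

    errors: array of shape (n_individuals, n_cases); lower is better.
    particularity_pressure: ASSUMED here to be the standard deviation of the
        normally distributed importance scores (an illustrative choice).
    """
    rng = np.random.default_rng() if rng is None else rng
    errors = np.asarray(errors, dtype=float)
    n_cases = errors.shape[1]

    # One row of importance scores per selection event (assumed distribution).
    scores = rng.normal(0.0, particularity_pressure, size=(n_select, n_cases))

    # Numerically stable softmax over the cases of each selection event.
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)

    # Core computation as a single matrix product:
    # (n_select, n_cases) @ (n_cases, n_individuals) -> (n_select, n_individuals)
    aggregated = weights @ errors.T

    # Each selection event keeps the individual with the lowest weighted error.
    return aggregated.argmin(axis=1)


if __name__ == "__main__":
    # Two specialists (rows 0 and 1) and one generalist (row 2).
    errors = np.array([[0.0, 10.0], [10.0, 0.0], [4.0, 4.0]])
    rng = np.random.default_rng(0)
    print(np.bincount(dalex_select(errors, 1000, 20.0, rng), minlength=3))
    print(np.bincount(dalex_select(errors, 1000, 0.01, rng), minlength=3))
```

In this sketch, a large standard deviation makes each weight vector nearly one-hot, so a single case dominates each selection event and specialists tend to be chosen. A small standard deviation makes the weights nearly uniform, so selection favors the individual with the lowest average error.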
