Sharpness-Aware Minimization in Genetic Programming

Illya Bakurov; Nathan Haut; Wolfgang Banzhaf

TL;DR

This work transfers Sharpness-Aware Minimization (SAM) from deep learning to tree Genetic Programming to curb overfitting and unstable interpolation in small-data regimes. It introduces two SAM-based adaptations: SAM-IN, which perturbs terminals (constants and inputs) by a magnitude $\epsilon$ and uses randomized double tournament selection to balance fitness and sharpness, and SAM-OUT, which perturbs program semantics via a normalized geometric semantic mutation (GSM) neighborhood with $ms=\epsilon$ to estimate sharpness as a variance across neighbors, without re-evaluating the reference. Evaluations across four real-world regression tasks and four synthetic functions show that both SAM variants produce notably smaller and less redundant trees while maintaining or improving generalization on real data; SAM-IN often yields better generalization, whereas SAM-OUT offers computational efficiency and strong performance on several problems. The results suggest that incorporating sharpness-aware criteria into GP enhances stability and trustworthiness, and the framework can be extended to other discrete or symbolic learning settings with minimal architectural constraints, given suitable perturbation schemes.
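The two sharpness estimates summarized above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the tuple-based tree encoding, the multiplicative uniform noise model, the logistic stand-ins for normalized random trees, the use of RMSE as fitness, and the function names `evaluate`, `sharpness_in`, and `sharpness_out` are all assumptions made for this example.

```python
import numpy as np

# Hypothetical tree encoding (an assumption for this sketch):
# ("add", l, r), ("mul", l, r), ("x", i) for input i, ("const", v).
def evaluate(tree, X, rng=None, eps=0.0):
    kind = tree[0]
    if kind == "x":
        out = X[:, tree[1]].astype(float)
    elif kind == "const":
        out = np.full(X.shape[0], float(tree[1]))
    else:
        left = evaluate(tree[1], X, rng, eps)
        right = evaluate(tree[2], X, rng, eps)
        return left + right if kind == "add" else left * right
    if rng is not None and eps > 0.0:
        # Assumed noise model: relative perturbation of each terminal by at most eps.
        out = out * (1.0 + eps * rng.uniform(-1.0, 1.0, size=out.shape))
    return out

def sharpness_in(tree, X, eps=0.1, n_neighbors=10, seed=0):
    """SAM-IN style estimate: output variance under terminal perturbation."""
    rng = np.random.default_rng(seed)
    outs = np.stack([evaluate(tree, X, rng, eps) for _ in range(n_neighbors)])
    # Variance per fitness case across perturbed evaluations, then averaged.
    return float(outs.var(axis=0).mean())

def sharpness_out(semantics, y, X, eps=0.1, n_neighbors=10, seed=0):
    """SAM-OUT style estimate: variance of fitness across GSM-like neighbors."""
    rng = np.random.default_rng(seed)
    fits = []
    for _ in range(n_neighbors):
        # Stand-ins for two random trees whose outputs are bounded in (0, 1).
        w1, w2 = rng.normal(size=(2, X.shape[1]))
        r1 = 1.0 / (1.0 + np.exp(-(X @ w1)))
        r2 = 1.0 / (1.0 + np.exp(-(X @ w2)))
        # GSM-style neighbor built from cached semantics with step ms = eps;
        # the reference program itself is never re-evaluated.
        neighbor = semantics + eps * (r1 - r2)
        fits.append(np.sqrt(np.mean((neighbor - y) ** 2)))
    return float(np.var(fits))
```

Under these assumptions, a program whose output reacts strongly to small terminal perturbations receives a larger `sharpness_in` value, while `sharpness_out` only needs the cached output vector of the reference program, which is where its lower cost would come from.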

Abstract

Sharpness-Aware Minimization (SAM) was recently introduced as a regularization procedure for training deep neural networks. It simultaneously minimizes the fitness (or loss) function and the so-called fitness sharpness. The latter serves as a measure of the nonlinear behavior of a solution; minimizing it favors solutions that lie in neighborhoods with uniformly similar loss values across all fitness cases. In this contribution, we adapt SAM for tree Genetic Programming (TGP) by exploring the semantic neighborhoods of solutions using two simple approaches. By perturbing the input and output of program trees, sharpness can be estimated and used as a second optimization criterion during evolution. To better understand the impact of this variant of SAM on TGP, we collect numerous indicators of the evolutionary process, including generalization ability, complexity, diversity, and a recently proposed genotype-phenotype mapping to study the amount of redundancy in trees. The experimental results demonstrate that using either of the two proposed SAM adaptations in TGP allows (i) a significant reduction of tree sizes in the population and (ii) a decrease in redundancy of the trees. When assessed on real-world benchmarks, the generalization ability of the elite solutions does not deteriorate.
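One way sharpness could act as a second selection criterion is a randomized double tournament, the selection scheme named in the TL;DR. The sketch below is hedged: it assumes that "randomized" means the order of the two criteria is drawn at random for each selection event, that both criteria are minimized, and that the tournament sizes `size1` and `size2` are free parameters. The function names and default values are hypothetical and not taken from the paper.

```python
import random

def tournament(pool, key, size, rng):
    """Return the contestant with the lowest key among `size` random draws."""
    contestants = rng.sample(pool, min(size, len(pool)))
    return min(contestants, key=key)

def randomized_double_tournament(population, fitness, sharpness,
                                 size1=4, size2=2, rng=None):
    """Select one parent using two nested tournaments (lower is better).

    Assumption: the criterion used in the first stage is chosen by a fair
    coin flip, and the remaining criterion decides the final stage.
    """
    rng = rng or random.Random()
    if rng.random() < 0.5:
        first, second = fitness, sharpness
    else:
        first, second = sharpness, fitness
    # Stage 1: run size2 qualifying tournaments on the first criterion.
    qualifiers = [tournament(population, first, size1, rng) for _ in range(size2)]
    # Stage 2: the qualifiers compete on the second criterion.
    return tournament(qualifiers, second, size2, rng)
```

Here `fitness` and `sharpness` are callables that map an individual to a number, so cached values can be looked up instead of recomputed for every tournament.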

Paper Structure (12 sections, 9 figures, 5 tables, 2 algorithms)

Introduction
Related Work
Sharpness-Aware Minimization in Deep Learning
Semantic Awareness in Genetic Programming
Noisy Data and Fitness Functions
SAM in GP
Proposed Approach
SAM on Input (SAM-IN)
SAM on Output (SAM-OUT)
Experimental settings
Experimental results
Conclusions

Figures (9)

Figure 1: A sharp model and a smooth model are compared, along with the values returned by our SAM-IN metric, showing that the metric clearly identifies the sharper model.
Figure 2: An example of a GP tree. The red arrows indicate the locations where noise is injected.
Figure 3: An example of using sharpness to generate a smoother approximation (left) of the data compared to the original data generator (Ackley function, right). This would be useful in scenarios where the goal is to use the model to predict the location of a global optimum.
Figure 4: Average population training (left) and test (right) fitness of SAM approaches vs. standard GP on the 4 real-world datasets.
Figure 5: Average population training (left) and test (right) fitness of SAM approaches vs. standard GP on the 4 synthetic datasets.
...and 4 more figures
