Shape Constraints in Symbolic Regression using Penalized Least Squares
Viktor Martinek, Julia Reuter, Ophelia Frotscher, Sanaz Mostaghim, Markus Richter, Roland Herzog
TL;DR
The paper incorporates shape constraints (SC) into symbolic regression (SR) by penalizing SC violations during the parameter identification step, using gradient-based, second-order optimization. SC are evaluated at a finite set of points, and the resulting penalties are added to the SR loss as soft constraints; the method is implemented in the NSGA-II-based TiSR platform. Experiments on the Gaussian, magman, and van der Waals problems, under varying noise levels and reduced training data, compare three variants: a baseline, a variant that uses SC only in the selection process (obj), and the proposed variant that also minimizes SC violations during fitting (minim_obj). SC help most when data are scarce. In some test cases minim_obj yields a statistically significant benefit over obj, and it is never significantly worse, which suggests practical value for extrapolation and for integrating prior knowledge in data-sparse regimes. The authors point to empirical datasets and broader SC types as future extensions.
Abstract
We study the addition of shape constraints (SC) and their consideration during the parameter identification step of symbolic regression (SR). SC serve as a means to introduce prior knowledge about the shape of the otherwise unknown model function into SR. Unlike previous works that have explored SC in SR, we propose minimizing SC violations during parameter identification using gradient-based numerical optimization. We test three algorithm variants to evaluate their performance in identifying three symbolic expressions from synthetically generated data sets. This paper examines two benchmark scenarios: one with varying noise levels and another with reduced amounts of training data. The results indicate that incorporating SC into the expression search is particularly beneficial when data is scarce. Compared to using SC only in the selection process, our approach of minimizing violations during parameter identification shows a statistically significant benefit in some of our test cases, without being significantly worse in any instance.
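The idea of penalizing SC violations during parameter identification can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or from TiSR: the toy expression f(x) = a*x + b*x^2, the three data points, the constraint f'(x) >= 0 checked at five points, the penalty weight, and the hand-rolled Levenberg-Marquardt loop that stands in for the gradient-based optimizer.

```python
import math

# Hypothetical scarce, noisy data; the constraint points extend beyond the data range.
XS = [0.5, 1.0, 1.5]
YS = [0.9, 1.0, 0.9]
SC_POINTS = [0.0, 0.5, 1.0, 1.5, 2.0]


def model(x, theta):
    """Toy candidate expression f(x) = a*x + b*x**2 (an assumption)."""
    a, b = theta
    return a * x + b * x * x


def dmodel_dx(x, theta):
    """Analytic derivative of the toy expression with respect to x."""
    a, b = theta
    return a + 2.0 * b * x


def sc_violation(theta, sc_points):
    """Sum of squared violations of f'(x) >= 0 at a finite set of points."""
    return sum(max(0.0, -dmodel_dx(c, theta)) ** 2 for c in sc_points)


def objective(theta, xs, ys, sc_points, weight):
    """Penalized least squares: squared residuals plus weighted SC violations."""
    sse = sum((model(x, theta) - y) ** 2 for x, y in zip(xs, ys))
    return sse + weight * sc_violation(theta, sc_points)


def stacked_rows(theta, xs, ys, sc_points, weight):
    """(residual, Jacobian row) pairs for data terms and violated constraints."""
    rows = [(model(x, theta) - y, (x, x * x)) for x, y in zip(xs, ys)]
    s = math.sqrt(weight)
    for c in sc_points:
        v = -dmodel_dx(c, theta)
        if v > 0.0:  # hinge term is active only where the constraint is violated
            rows.append((s * v, (-s, -2.0 * s * c)))
    return rows


def fit(xs, ys, sc_points, weight, iters=200):
    """Damped Gauss-Newton (Levenberg-Marquardt) on the penalized objective."""
    theta, mu = (0.0, 0.0), 1e-3
    cur = objective(theta, xs, ys, sc_points, weight)
    for _ in range(iters):
        rows = stacked_rows(theta, xs, ys, sc_points, weight)
        h11 = sum(j[0] * j[0] for r, j in rows) + mu
        h12 = sum(j[0] * j[1] for r, j in rows)
        h22 = sum(j[1] * j[1] for r, j in rows) + mu
        g1 = sum(r * j[0] for r, j in rows)
        g2 = sum(r * j[1] for r, j in rows)
        det = h11 * h22 - h12 * h12
        # Solve the damped 2x2 normal equations for the step.
        da = (-g1 * h22 + g2 * h12) / det
        db = (g1 * h12 - g2 * h11) / det
        trial = (theta[0] + da, theta[1] + db)
        new = objective(trial, xs, ys, sc_points, weight)
        if new < cur:  # accept the step and relax the damping
            theta, cur, mu = trial, new, max(mu / 10.0, 1e-12)
        else:  # reject the step and increase the damping
            mu = min(mu * 10.0, 1e12)
    return theta


theta_plain = fit(XS, YS, SC_POINTS, weight=0.0)   # ordinary least squares
theta_sc = fit(XS, YS, SC_POINTS, weight=1000.0)   # SC-penalized fit
print("violation without penalty:", sc_violation(theta_plain, SC_POINTS))
print("violation with penalty:   ", sc_violation(theta_sc, SC_POINTS))
```

With weight set to zero the fit is free to bend downward outside the data range; with a positive weight the same optimizer trades a little data misfit for parameters that respect the assumed monotonicity. The squared hinge keeps the penalty continuously differentiable, which is one reason a soft-constraint formulation of this kind pairs naturally with gradient-based solvers.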
