Inferring Interpretable Models of Fragmentation Functions using Symbolic Regression
Nour Makke, Sanjay Chawla
TL;DR
This work tackles the challenge of obtaining interpretable fragmentation-function (FF) forms directly from experimental data using symbolic regression (SR). By applying a transformer-based SR model to COMPASS SIDIS multiplicities, the authors extract analytic FF-like expressions, with the top univariate form $f_{\text{SR}}(z) = a(1-z)^{c}\,\exp(-b z)$ closely resembling the Lund FF and describing data across species and phase space. The study demonstrates that SR can recover meaningful, human-interpretable functions from noisy measurements, performing well in univariate and limited bivariate contexts while revealing limitations in universal multi-dimensional parameterizations. These results suggest SR-derived FF forms could serve as data-driven parameterizations in global QCD fits, offering a pathway toward interpretable, physics-grounded machine learning in high-energy phenomenology.
Abstract
Machine learning is rapidly making its way into the natural sciences, including high-energy physics. We present the first study that infers a functional form of fragmentation functions directly from experimental data. Fragmentation functions are a key ingredient in describing physical observables measured in high-energy physics processes that involve hadron production, and in predicting their values at different energies. They cannot be calculated from theory and must instead be determined from data. Traditional approaches rely on global fits of experimental data, using a pre-assumed functional form inspired by phenomenological models and learning only its parameters. Our approach instead uses an ML technique, namely symbolic regression, to learn an analytical model from measured charged-hadron multiplicities. The function learned by symbolic regression resembles the Lund string function and describes the data well, making it a potential candidate for use in global FF fits. This study offers an approach that can be followed in similar QCD-related phenomenology studies and, more generally, in other sciences.
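To illustrate how a closed form such as $f_{\text{SR}}(z) = a(1-z)^{c}\,\exp(-b z)$ could be used as a fit parameterization, the sketch below evaluates it and recovers its parameters from synthetic points. It relies on the fact that $\ln f_{\text{SR}} = \ln a + c\,\ln(1-z) - b z$ is linear in $(\ln a, c, b)$, so ordinary least squares suffices. The parameter values, the $z$ grid, and the helper names (`f_sr`, `solve3`, `fit_log_linear`) are assumptions chosen for illustration, not values or code from the paper, and the data are noise-free.

```python
import math

def f_sr(z, a, b, c):
    """Candidate form a * (1 - z)**c * exp(-b * z), for 0 <= z < 1."""
    return a * (1.0 - z) ** c * math.exp(-b * z)

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    aug = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(n):
        # Move the largest remaining entry of this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def fit_log_linear(zs, ys):
    """Unweighted least squares on ln y = ln a + c*ln(1-z) - b*z; returns (a, b, c)."""
    rows = [(1.0, math.log(1.0 - z), -z) for z in zs]
    targets = [math.log(y) for y in ys]
    # Normal equations: (X^T X) beta = X^T t, with beta = (ln a, c, b).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    ln_a, c, b = solve3(xtx, xty)
    return math.exp(ln_a), b, c

# Synthetic, noise-free points; parameter values are illustrative assumptions.
true_a, true_b, true_c = 1.5, 2.0, 1.2
zs = [0.2 + 0.05 * i for i in range(13)]  # z from 0.20 to 0.80
ys = [f_sr(z, true_a, true_b, true_c) for z in zs]

a_fit, b_fit, c_fit = fit_log_linear(zs, ys)
print(f"a = {a_fit:.4f}, b = {b_fit:.4f}, c = {c_fit:.4f}")
```

With measured multiplicities, the points carry uncertainties, so a weighted or fully nonlinear fit would be more appropriate than this unweighted log-linear version. The sketch is only meant to show that the form has three parameters that are straightforward to constrain.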
