Understanding Charge Radii with Machine Learning: Discovering Physics Expressions

B. Maheshwari, P. Van Isacker

TL;DR

This work tackles the challenge of accurately predicting nuclear charge radii while preserving interpretability by coupling high-accuracy numerical regression (LGBM and Gaussian Process Regression) with symbolic regression to extract analytic physics expressions. By engineering physically meaningful features such as $A^{1/3}$, $Z^{1/3}$, $BEA$, isospin $I$, Casten factor $CF$, and pairing gap $P$, and enforcing four-fold cross-validation, the authors obtain robust, extrapolatable predictions across the nuclear chart. The symbolic regression stage distills the learned models into compact formulas that echo liquid-drop-like terms, uncovering the dominance of $A^{1/3}$ and $Z^{1/3}$ scaling and revealing corrections tied to binding energy, isospin, and pairing effects. This hybrid data-driven approach provides interpretable, physics-informed expressions with quantified uncertainties, offering a blueprint for discovery of governing relations in nuclear structure and beyond.
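The feature engineering and four-fold cross-validation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the isospin asymmetry is assumed to be $I = (N-Z)/A$, and only the features that follow directly from $Z$ and $N$ are built (binding energy, Casten factor, and pairing gap need external data and are omitted).

```python
import random


def engineer_features(Z, N):
    """Return a few physically motivated features for a nucleus (Z, N).

    Hypothetical helper: only features computable from Z and N alone.
    """
    A = Z + N
    return {
        "A13": A ** (1.0 / 3.0),  # mass scaling A^{1/3}
        "Z13": Z ** (1.0 / 3.0),  # proton-number scaling Z^{1/3}
        "I": (N - Z) / A,         # isospin asymmetry (assumed definition)
    }


def four_fold_indices(n_samples, n_folds=4, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[k::n_folds] for k in range(n_folds)]


def train_test_splits(n_samples, n_folds=4, seed=0):
    """Yield (train, test) index lists; each fold is the test set once."""
    folds = four_fold_indices(n_samples, n_folds, seed)
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

Predictions collected on each held-out fold are the "out-of-fold" predictions: every nucleus is predicted by a model that never saw it during training.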

Abstract

We introduce a robust, interpretable machine learning (ML) framework that combines numerical regression for high-accuracy predictions with symbolic regression to uncover the underlying physics. This hybrid approach efficiently derives analytical expressions by leveraging the smoothed predictions of optimized ML models, which is significantly faster than direct symbolic regression on raw experimental data. We apply this framework, as an example, to nuclear charge radii across the nuclear chart, notably including light nuclei that are often excluded from such studies. We employ Light Gradient Boosting Machine (LGBM) and Gaussian Process Regression (GPR) models to map correlations between charge radii and key physical features: mass $A^{1/3}$ and proton number $Z^{1/3}$ dependencies, total binding energy, and, for the first time, the pairing gap. Our models are trained using four-fold cross-validation with automated hyperparameter optimization to ensure robustness and generalizability, which is critical for the typically small and skewed datasets in nuclear physics. Finally, we distill the knowledge from the initial LGBM and GPR models into simplified, interpretable mathematical expressions via symbolic regression, white-boxing these ML models. The derived formulas provide physical insights comparable to traditional many-body models and demonstrate a powerful pathway for physics expression discovery guided by ML.
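To make the distillation idea concrete, the sketch below fits a single fixed candidate expression, the liquid-drop-like form $R = r_0 A^{1/3}$, to a set of smooth surrogate predictions by least squares. This is a deliberately reduced stand-in: true symbolic regression searches over many expression forms, whereas here the form is fixed and only $r_0$ is fitted. The function names are hypothetical, and the surrogate values are synthetic placeholders generated from an arbitrary $r_0$, not model outputs or measured radii.

```python
def fit_r0(mass_numbers, radii):
    """Closed-form least-squares r0 for the form R = r0 * A**(1/3)."""
    x = [A ** (1.0 / 3.0) for A in mass_numbers]
    numerator = sum(xi * yi for xi, yi in zip(x, radii))
    denominator = sum(xi * xi for xi in x)
    return numerator / denominator


def rmse(mass_numbers, radii, r0):
    """Root-mean-square deviation of R = r0 * A**(1/3) from the targets."""
    residuals = [r0 * A ** (1.0 / 3.0) - y for A, y in zip(mass_numbers, radii)]
    return (sum(r * r for r in residuals) / len(residuals)) ** 0.5


# Synthetic placeholder "surrogate predictions" built from an arbitrary r0;
# in practice these would be the smoothed outputs of a trained model.
mass_numbers = [16, 40, 90, 208]
r0_placeholder = 1.0
surrogate = [r0_placeholder * A ** (1.0 / 3.0) for A in mass_numbers]

r0_fit = fit_r0(mass_numbers, surrogate)
fit_error = rmse(mass_numbers, surrogate, r0_fit)
```

Because the targets here are noise-free by construction, the fit recovers the placeholder $r_0$ essentially exactly; with real surrogate outputs the residual would indicate how much structure the one-term form leaves unexplained.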

Paper Structure

This paper contains 6 sections, 4 equations, 8 figures.

Figures (8)

  • Figure 1: Cycle of a robust machine-learning algorithm from predictions-plus-extrapolations to the discovery of expressions.
  • Figure 2: Charge radii predicted by the (a) LGBM and (b) GPR models versus experimental data.
  • Figure 3: SHAP contributions in percentage for the features used in optimizing (a) LGBM and (b) GPR models.
  • Figure 4: Correlation matrix of experimental, LGBM predicted and GPR predicted charge radii with input features.
  • Figure 5: Measured charge radii of the Li $(Z=3)$ and Be $(Z=4)$ isotopes compared with LGBM and GPR out-of-fold predictions. Also shown are the ab initio results from Forssen2009.
  • ...and 3 more figures