Architecture as physical prior: cooperative neural network for nuclear masses

Peiwen Zai, Wei Cheng, Feng-Shou Zhang

TL;DR

The results demonstrate that physically motivated architectural constraints can effectively substitute for feature engineering, establishing "architecture as physical prior" as a promising paradigm for neural-network mass modeling.

Abstract

Machine learning approaches to nuclear mass prediction have achieved remarkable accuracy, but typically rely on existing theoretical baselines or hand-crafted physics features. Here we demonstrate that these prerequisites can be supplanted by structural inductive biases embedded directly in the network architecture. We present the Cooperative Neural Network (CoNN), which predicts binding energies from raw proton and neutron numbers $(Z,N)$ alone by additively combining four structurally constrained modules: a smooth network for bulk liquid-drop trends, discrete scalar embeddings for shell effects, a learnable two-dimensional grid for regional collective correlations, and a parity-aware network for odd–even staggering. On the AME2020 dataset, the CoNN achieves a root-mean-square deviation of 0.269 MeV across all 3558 nuclei, with 0.419 MeV on a held-out interpolation subset and 0.728 MeV on 122 nuclei newly measured since AME2016, placing it among the most accurate baseline-free approaches to direct mass prediction. Notably, the learned embeddings develop pronounced extrema at canonical magic numbers and the pairing module reproduces the expected odd–even staggering along isotopic chains, both emerging from the data without explicit supervision. These results demonstrate that physically motivated architectural constraints can effectively substitute for feature engineering, establishing "architecture as physical prior" as a promising paradigm for neural-network mass modeling.
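
The additive four-module decomposition described in the abstract can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the layer sizes, the chart bounds `Z_MAX` and `N_MAX`, the coarse cell size `GRID`, the choice to feed only parities into the pairing network, and all function and key names are hypothetical. Only the overall structure (a smooth bulk network, per-$Z$ and per-$N$ scalar embeddings, a two-dimensional grid, and a parity-aware network, summed into one prediction) comes from the text. Parameters are random and untrained, so the numbers it produces carry no physical meaning.

```python
import numpy as np

rng = np.random.default_rng(0)
Z_MAX, N_MAX = 120, 180  # assumed chart bounds (not from the paper)
GRID = 10                # assumed coarse cell size of the correlation grid


def init_mlp(n_in, n_hidden):
    """Random weights for a tiny two-layer perceptron (sizes are assumptions)."""
    return (rng.normal(0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0, 0.1, (n_hidden, 1)), np.zeros(1))


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with tanh activation, smooth in its inputs."""
    return (np.tanh(x @ w1 + b1) @ w2 + b2)[..., 0]


params = {
    "bulk": init_mlp(2, 16),                     # smooth liquid-drop-like trend
    "shell_z": rng.normal(0, 0.1, Z_MAX + 1),    # one scalar per proton number
    "shell_n": rng.normal(0, 0.1, N_MAX + 1),    # one scalar per neutron number
    "grid": rng.normal(0, 0.1, (Z_MAX // GRID + 1, N_MAX // GRID + 1)),
    "pair": init_mlp(2, 4),                      # sees only (Z mod 2, N mod 2)
}


def conn_forward(Z, N, p):
    """Return the summed prediction and the four module outputs."""
    Z, N = np.asarray(Z), np.asarray(N)
    scaled = np.stack([Z / Z_MAX, N / N_MAX], axis=-1)
    parity = np.stack([Z % 2, N % 2], axis=-1).astype(float)
    parts = {
        "bulk": mlp(scaled, *p["bulk"]),
        "shell": p["shell_z"][Z] + p["shell_n"][N],  # separable in Z and N
        "cor": p["grid"][Z // GRID, N // GRID],      # non-separable, regional
        "pair": mlp(parity, *p["pair"]),
    }
    return sum(parts.values()), parts


# Usage: two even-even nuclei, (Z, N) = (82, 126) and (50, 82).
b_pred, parts = conn_forward([82, 50], [126, 82], params)
```

The design point this sketch tries to convey is that each module can only express one kind of structure: the embeddings are separable in $Z$ and $N$, the grid is piecewise constant over regions, and the pairing term depends on parity alone, so both even-even nuclei above receive the same pairing contribution.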

Paper Structure (10 sections, 5 equations, 7 figures, 2 tables)

Figures (7)

  • Figure 1: Distribution of nuclei on the $(N,Z)$ chart. Gray: training set (80% of the AME2016 pool); red: validation set (remaining 20%); blue: extrapolation set (nuclei in AME2020 but absent from AME2016). Dashed lines indicate magic numbers.
  • Figure 2: Architecture of the CoNN model. The binding energy is decomposed into a macroscopic contribution from the bulk-properties network and microscopic corrections from three modules: shell embeddings, a regional correlation grid, and a pairing network. The four outputs are summed to yield $B_{\mathrm{pred}}$.
  • Figure 3: Residual maps ($B_{\mathrm{pred}}-B_{\mathrm{exp}}$, in MeV) on the $(N,Z)$ chart. (a) FRDM2012 spherical macroscopic term. (b) CoNN macroscopic branch only. (c) Full CoNN model. (d) CoNN without the pairing network. All RMSDs are evaluated on the AME2016-overlap set ($n=3436$), which excludes the 122 extrapolation nuclei on the boundary.
  • Figure 4: Learned proton (top) and neutron (bottom) embedding biases from the CoNN. Dashed vertical lines mark canonical magic numbers. The embeddings develop clear structure near shell closures without explicit magic-number supervision.
  • Figure 5: Learned correlation grid $E_{\mathrm{Cor}}(Z,N)$ on the $(N,Z)$ chart. Dashed lines mark magic numbers. Localized patches near doubly-magic nuclei reflect non-separable proton–neutron correlations, while extended structures in mid-shell regions correspond to collective deformation.
  • ...and 2 more figures
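
The data partition in the Figure 1 caption and the RMSD metric quoted in Figure 3 can be sketched as follows. This is a hedged illustration, not the authors' procedure: the function names `split_nuclei` and `rmsd`, the uniform random shuffle, the seed, and the representation of each evaluation as a set of `(Z, N)` tuples are all assumptions. The captions only state that 80% of the AME2016 pool is used for training, the remaining 20% for validation, and that the extrapolation set consists of nuclei present in AME2020 but absent from AME2016.

```python
import numpy as np


def split_nuclei(ame2016, ame2020, train_frac=0.8, seed=0):
    """Split (Z, N) tuples into training, validation and extrapolation sets.

    A uniform random split is assumed here; the paper may split differently.
    """
    pool = sorted(ame2016)
    order = np.random.default_rng(seed).permutation(len(pool))
    n_train = int(round(train_frac * len(pool)))
    train = [pool[i] for i in order[:n_train]]
    val = [pool[i] for i in order[n_train:]]
    # Extrapolation set: measured in AME2020 but not in AME2016.
    extrap = sorted(set(ame2020) - set(ame2016))
    return train, val, extrap


def rmsd(b_pred, b_exp):
    """Root-mean-square deviation between predicted and experimental values."""
    diff = np.asarray(b_pred, dtype=float) - np.asarray(b_exp, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Usage with toy data (made-up nuclei, not real evaluation tables).
toy_2016 = {(1, n) for n in range(10)}
toy_2020 = toy_2016 | {(3, 3)}
train, val, extrap = split_nuclei(toy_2016, toy_2020)
```

With the real tables, the same `rmsd` function would be applied separately to each subset, which is how the abstract can quote one figure for all nuclei and others for the validation and extrapolation subsets.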