Physics-informed features in supervised machine learning

Margherita Lampani, Sabrina Guastavino, Michele Piana, Federico Benvenuto

TL;DR

This work addresses the limited interpretability of traditional feature-based supervised learning in scientific contexts by introducing physics-informed feature maps that produce dimensionally homogeneous representations within an RKHS framework. By defining a forward operator $A$ through a physics-informed map $\phi$, the authors recast learning as a regularized inverse problem and establish a theoretical link between the forward model and RKHS solutions via $\hat{f}_{\lambda} = A^{\dagger} \hat{g}_{\lambda}$. Through synthetic experiments on fluid dynamics (Bernoulli), pulsar magnetic dissipation, and binary-system classification, the method demonstrates improvements in regression and classification performance and recovers or identifies underlying physical relationships. A real-data application to solar flare forecasting shows SPIFs still provide predictive gains and highlights $PIF_2 = \Phi I$ as a key descriptor, suggesting magnetic helicity as a practical energy-distribution proxy. Overall, the physics-informed feature approach enhances explainability, supports mechanism discovery, and offers a pathway for discovering new physical equations within explainable ML.
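The identity $\hat{f}_{\lambda} = A^{\dagger} \hat{g}_{\lambda}$ can be checked numerically in a finite-dimensional stand-in for the RKHS setting. The sketch below is only an illustration under stated assumptions: `A` is a hypothetical random matrix (this summary does not give the paper's construction of $A$ from $\phi$), `ridge_solution` is a helper introduced here, and the regularizer is plain Tikhonov with a Euclidean norm. The identity holds because the regularized solution lies in the row space of `A`, which is exactly the subspace onto which $A^{\dagger} A$ projects.

```python
import numpy as np

def ridge_solution(A, y, lam):
    """Tikhonov-regularized solution: argmin ||A f - y||^2 + lam * ||f||^2."""
    n_features = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n_features), A.T @ y)

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 8))   # hypothetical operator: 5 samples, 8 features
y = rng.normal(size=5)        # hypothetical labels
lam = 0.1                     # regularization parameter (assumed value)

f_hat = ridge_solution(A, y, lam)      # regularized solution of the inverse problem
g_hat = A @ f_hat                      # predicted output
f_from_g = np.linalg.pinv(A) @ g_hat   # pseudo-inverse applied to the prediction

# True when the regularized solution is recovered from its own prediction.
identity_holds = np.allclose(f_hat, f_from_g)
```

With more features than samples, `A` has a non-trivial null space, so the check is not vacuous: a solution with a null-space component would not be recovered by the pseudo-inverse.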

Abstract

Supervised machine learning involves approximating an unknown functional relationship from a limited dataset of features and corresponding labels. The classical approach to feature-based machine learning typically relies on applying linear regression to standardized features, without considering their physical meaning. This may limit model explainability, particularly in scientific applications. This study proposes a physics-informed approach to feature-based machine learning that constructs non-linear feature maps informed by physical laws and dimensional analysis. These maps enhance model interpretability and, when physical laws are unknown, allow for the identification of relevant mechanisms through feature ranking. The method aims to improve both predictive performance in regression tasks and classification skill scores by integrating domain knowledge into the learning process, while also enabling the potential discovery of new physical equations within the context of explainable machine learning.
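As a rough illustration of the idea, the following sketch compares ridge regression on dimensionally homogeneous features with ridge regression on standardized raw features for a Bernoulli-type target. Everything specific here is an assumption made for the example rather than the authors' setup: the sampling ranges, the noise-free total-pressure target, the regularization value, and the helper `ridge_fit`.

```python
import numpy as np

G = 9.81  # gravitational acceleration [m/s^2]

def ridge_fit(X, y, lam):
    """Closed-form ridge regression without a separate intercept term."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(42)
n = 200
p = rng.uniform(1.0e5, 2.0e5, n)     # static pressure [Pa]
rho = rng.uniform(900.0, 1100.0, n)  # density [kg/m^3]
v = rng.uniform(0.0, 10.0, n)        # flow speed [m/s]
h = rng.uniform(0.0, 20.0, n)        # height [m]

# Hypothetical target: total pressure from the Bernoulli equation (noise-free).
y = p + 0.5 * rho * v**2 + rho * G * h

# Physics-informed features: every column has the dimension of a pressure.
X_pif = np.column_stack([p, rho * v**2, rho * G * h])

# Baseline: standardized raw features plus a constant column for the intercept.
X_raw = np.column_stack([p, rho, v, h])
train, test = np.arange(150), np.arange(150, n)
mu, sigma = X_raw[train].mean(axis=0), X_raw[train].std(axis=0)
X_sf = np.column_stack([(X_raw - mu) / sigma, np.ones(n)])

coef_pif = ridge_fit(X_pif[train], y[train], lam=1e-3)
coef_sf = ridge_fit(X_sf[train], y[train], lam=1e-3)

# Mean absolute error on the held-out samples for both feature sets.
mae_pif = np.mean(np.abs(X_pif[test] @ coef_pif - y[test]))
mae_sf = np.mean(np.abs(X_sf[test] @ coef_sf - y[test]))
```

Because the target is exactly linear in the physics-informed features, `coef_pif` should come out close to (1, 0.5, 1), which reads directly as the Bernoulli equation. The standardized-feature model cannot represent the products `rho * v**2` and `rho * h`, so its held-out error stays larger.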

Paper Structure

This paper contains 9 sections, 29 equations, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Outline of the process that implements the physics-driven solution of the feature-based machine learning problem: the measured features are transformed by a physics-informed map into an operator $A$ in a Hilbert space setting; the solution $\hat{f}_{\lambda}$ of the corresponding inverse problem provides the physical equation, which is transformed by $A$ into the predicted output $\hat{g}_{\lambda}$.
  • Figure 2: Boxplots of the IQR distributions for absolute errors (top panel) and squared errors (bottom panel) provided by ridge regression for predicting the Bernoulli equation at varying noise levels. The light blue and pink boxplots represent the results obtained using SPIFs and SFs, respectively.
  • Figure 3: Prediction of the Bernoulli equation. Saturation of MAE and MSE while applying the sequential feature ranking Algorithm 3.2.
  • Figure 4: Prediction of the pulsar equation. In each panel, from left to right, the IQR boxplots correspond to the results provided by ridge regression when training uses all SPIFs, all SPIFs except SPIF$_1$, and all SFs, respectively. The top and bottom panels show the absolute errors and squared errors, respectively, at varying noise levels.
  • Figure :
  • ...and 1 more figure