Interpretable Machine Learning in Physics: A Review

Sebastian Johann Wetzel, Seungwoong Ha, Raban Iten, Miriam Klopotek, Ziming Liu

TL;DR

This review surveys interpretability in ML as applied to physics, outlining why transparent models are essential for trust, debugging, and scientific understanding. It organizes concepts into notions of interpretation, philosophical perspectives, and a comprehensive catalog of algorithms, interpretability methods, and domain-specific applications across quantum, classical, high-energy, astrophysical, and complex systems. Key contributions include frameworks for distinguishing intrinsic from post-hoc interpretability, and a synthesis of symbolic regression, Hamiltonian/Lagrangian-inspired networks, and symmetry/conservation discoveries that yield human-readable physical insights. By connecting physical principles with interpretable ML techniques, the work highlights how transparent representations can drive reliable discoveries and practical advancements in experimental and theoretical physics. The review presents the field as poised to enable AI-augmented scientific inference that remains aligned with human understanding and scientific rigor.

Abstract

Machine learning is increasingly transforming various scientific fields, enabled by advancements in computational power and access to large data sets from experiments and simulations. As artificial intelligence (AI) continues to grow in capability, these algorithms will enable many scientific discoveries beyond human capabilities. Since the primary goal of science is to understand the world around us, fully leveraging machine learning in scientific discovery requires models that are interpretable -- allowing experts to comprehend the concepts underlying machine-learned predictions. Successful interpretations increase trust in black-box methods, help reduce errors, allow for the improvement of the underlying models, enhance human-AI collaboration, and ultimately enable fully automated scientific discoveries that remain understandable to human scientists. This review examines the role of interpretability in machine learning applied to physics. We categorize different aspects of interpretability, discuss machine learning models in terms of both interpretability and performance, and explore the philosophical implications of interpretability in scientific inquiry. Additionally, we highlight recent advances in interpretable machine learning across many subfields of physics. By bridging boundaries between disciplines -- each with its own unique insights and challenges -- we aim to establish interpretable machine learning as a core research focus in science.

Paper Structure

This paper contains 69 sections, 5 equations, 9 figures.

Figures (9)

  • Figure 1: A scientist attempts to peep into the black box of artificial intelligence.
  • Figure 2: In the near future, AI is expected to drive scientific discoveries beyond the capabilities of human scientists. In science, the primary goal is to facilitate a human understanding of new and unknown concepts. To bridge the gap between AI-generated insights and human understanding, it is crucial to interpret AI systems. By doing so, human scientists can access and integrate knowledge that lies at the intersection of AI-discoverable knowledge and human-verifiable understanding -- unlocking scientific concepts beyond what humans alone can discover.
  • Figure 3: Overview of the scientific scope of the paper: the physics sub-fields, the most frequently reported algorithms, and overarching concepts addressable with or useful for interpretable machine learning in physics. The background image is credited to Lawrence Berkeley Lab [quark_picture].
  • Figure 4: Machine learning methods and their intrinsic interpretability on a 5-point scale. Depicted algorithms and their interpretability scores are NN (Neural network, 1), GAN (Generative adversarial network, 2), BM (Boltzmann machine, 2), RL (Reinforcement learning, 2), AE (Autoencoder, 3), RC (Reservoir computing, 3), GNN (Graph neural network, 3), SENN (Self-explaining neural network, 3), SVM (Support vector machine, 4), Decision tree (4), PCA (Principal Component Analysis, 5), and Lin. Reg. (Linear regression, 5) respectively (from left to right).
  • Figure 5: Three representative methods for symbolic regression. (a) PySR uses genetic programming to evolve symbolic formulas, analogous to survival of the fittest in Darwinian evolution [cranmer2023interpretable]. (b) Equation Learner (EQL) replaces activation functions in multi-layer perceptrons with hand-encoded symbolic functions (e.g., sin, exp, multiplications) [martius2016extrapolation]. (c) Kolmogorov-Arnold Network (KAN) has learnable activation functions parameterized by B-splines, which are matched and converted to symbolic functions after training [liu2024kan].
  • ...and 4 more figures
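
To make the Equation Learner idea from the Figure 5 caption concrete, the following is a minimal sketch of the forward pass of one EQL-style layer in NumPy. It is an illustration under stated assumptions, not the implementation from the cited work: the choice of units (identity, sin, cos, one multiplication), the function names `eql_layer` and `eql_forward`, and the hand-set weights are all hypothetical, and no training is performed. The point is only that, once the weights are sparse, the network can be read off as a formula.

```python
import numpy as np


def eql_layer(x, W, b):
    """One EQL-style hidden layer (illustrative, hypothetical layout).

    x: (n_samples, n_in), W: (n_in, 5), b: (5,)
    Pre-activations z[:, 0..2] feed the unary units identity, sin, cos;
    z[:, 3] and z[:, 4] are the two inputs of one multiplication unit.
    Returns an array of shape (n_samples, 4).
    """
    z = x @ W + b
    return np.stack(
        [z[:, 0], np.sin(z[:, 1]), np.cos(z[:, 2]), z[:, 3] * z[:, 4]],
        axis=1,
    )


def eql_forward(x, W, b, readout):
    """Hidden layer followed by a linear readout of shape (4,)."""
    return eql_layer(x, W, b) @ readout


# Hand-set sparse weights (an assumption for illustration, not learned):
# the sin unit sees x0, the multiplication unit sees (x0, x1).
W = np.zeros((2, 5))
W[0, 1] = 1.0  # sin unit input     <- x0
W[0, 3] = 1.0  # product, factor 1  <- x0
W[1, 4] = 1.0  # product, factor 2  <- x1
b = np.zeros(5)
readout = np.array([0.0, 1.0, 0.0, 1.0])  # keep only sin and product units

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=(100, 2))
y = eql_forward(x, W, b, readout)

# With these weights the network is exactly y = sin(x0) + x0 * x1.
target = np.sin(x[:, 0]) + x[:, 0] * x[:, 1]
max_err = np.max(np.abs(y - target))
```

In an actual EQL setup the weights would be learned, typically with a sparsity-encouraging penalty, so that most entries of `W` and `readout` vanish and the surviving ones spell out a compact expression like the one above.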