AI for Science: February 2026 Week 9

Feb 23 – Mar 1, 2026 · 42 papers analyzed · 3 breakthroughs

Summary

Week of 2026-02-23 to 2026-03-01. 42 papers analyzed across AI4Math and AI4Physics. 3 breakthroughs, 6 notable. Top findings: (1) 2602.22631 TorchLean — first framework to formalize neural networks in Lean 4 with a single shared IR for execution and formal verification, including mechanized reverse-mode AD correctness and explicit IEEE-754 binary32 semantics; (2) 2602.21551 — interpretable Gaussian particle representation for PDE operators with formal approximation theorems, achieving mesh-agnostic, linear-cost fluid dynamics; (3) 2602.20232 — neural wavefunction method (MōLe) that learns molecular orbital representations to reach coupled-cluster accuracy at reduced cost. Key trend: convergence of formal verification methods with ML (TorchLean, TorchLean-PINN, PC-FOL) and foundation models for quantum chemistry pushing toward coupled-cluster accuracy.

Key Takeaway

Week 9 is defined by two colliding forces: formal rigor entering ML (TorchLean, proof-by-cases benchmarks) and learned models reaching quantum chemistry gold-standard accuracy (MōLe) — the gap between what AI can compute and what it can verify is narrowing from both ends.

Breakthroughs (3)

1. TorchLean: Formalizing Neural Networks in Lean

Why Novel: Existing verification tools operate on exported ONNX/TorchScript artifacts, creating trust boundaries at every conversion step. TorchLean is the first system to make a neural network's training-time definition the formal semantic ground truth — execution, differentiation, and certificates all refer to the same object.

Impact: Provides semantics-first infrastructure for fully formal neural network verification, directly addressing the semantic gap that has made existing tools unable to give end-to-end guarantees for deployed models.
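The shared-IR idea can be illustrated with a toy Lean 4 sketch. This is not TorchLean's actual API (its identifiers are not given here); it only shows the pattern the paper enables: a layer defined as an executable total function, about which one can then state machine-checked theorems against a denotational semantics.

```lean
-- Illustrative sketch only, not TorchLean's real interface:
-- a dense layer as an executable Lean 4 function.
def relu (x : Float) : Float := max x 0.0

def dense (w : Array (Array Float)) (b : Array Float)
    (x : Array Float) : Array Float :=
  (w.zip b).map fun (row, bi) =>
    relu ((row.zip x).foldl (fun acc (wi, xi) => acc + wi * xi) bi)

-- The kind of statement a single shared IR makes possible: the
-- executable forward pass agrees with its formal semantics
-- (hypothetical names, statement only):
-- theorem forward_correct (w b x) :
--     dense w b x = ⟦denote (Layer.dense w b)⟧ x := ...
```

Because execution and semantics refer to the same definition, there is no ONNX-style export step at which meaning could silently change.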

2. From Basis to Basis: Gaussian Particle Representation for Interpretable PDE Operators

Why Novel: Neural operators and Transformer-based PDE solvers struggle with interpretability and localized high-frequency structures while incurring quadratic cost. This work provides the first mesh-agnostic operator with an explicit geometric state representation and formal theoretical backing for the approximation.

Impact: Opens an interpretable paradigm for PDE operator learning where the representation itself encodes physical geometry, potentially enabling more reliable scientific ML.
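A minimal sketch of the representational idea, under assumptions (the dynamics and parameterization below are illustrative stand-ins, not the paper's architecture): a field is carried by a set of Gaussian particles, so it can be evaluated at arbitrary query points without a mesh, at cost linear in the number of particles.

```python
import numpy as np

# Toy Gaussian-particle field in 1D: each particle has a center mu,
# width sigma, and weight w. Evaluation at any query points is
# mesh-agnostic and costs O(n_particles * n_queries).
rng = np.random.default_rng(0)
mu = rng.uniform(0.0, 1.0, size=8)      # particle centers
sigma = np.full(8, 0.1)                 # particle widths
w = rng.normal(size=8)                  # particle weights

def evaluate(x):
    """u(x) = sum_i w_i * exp(-(x - mu_i)^2 / (2 sigma_i^2))."""
    d = x[:, None] - mu[None, :]
    return (w * np.exp(-0.5 * (d / sigma) ** 2)).sum(axis=1)

# A toy "operator" (pure advection) acts directly on particle
# parameters -- the state itself encodes the geometry, no re-meshing.
def advect(velocity, dt):
    global mu
    mu = mu + velocity * dt

x = np.linspace(0.0, 1.0, 5)
before = evaluate(x)            # field sampled before the operator
advect(velocity=0.2, dt=0.5)    # every center shifts by 0.1
after = evaluate(x + 0.1)       # shifted query recovers the old field
```

In the paper's setting, the update rule acting on particle parameters is learned rather than hand-written, which is where the interpretability claim comes from: the learned state is a set of geometric objects, not an opaque latent vector.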

3. Coupled Cluster with MōLe: Molecular Orbital Learning for Neural Wavefunctions

Why Novel: Density functional theory (DFT) is fast but insufficiently accurate; coupled cluster (CC) is accurate but scales as O(N^7). Neural wavefunctions (e.g., FermiNet) improve on DFT but struggle with transferability. MōLe's learned molecular orbital approach bridges this gap with a trainable representation that generalizes across molecular geometries.

Impact: If generalizable, could make coupled-cluster quality predictions accessible at DFT-like cost, which would transform computational chemistry workflows.
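MōLe's precise architecture is not detailed above, but the neural-wavefunction pattern it builds on can be sketched (toy orbitals below are illustrative, not the model's): learned molecular orbitals are evaluated at electron positions and antisymmetrized by a Slater determinant, so the wavefunction flips sign under electron exchange as fermionic physics requires.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3                                    # toy system: 3 electrons

# Stand-in "learned" orbitals: Gaussians with trainable centers/widths.
centers = rng.normal(size=(n, 3))
alphas = rng.uniform(0.5, 1.5, size=n)

def orbitals(r):
    """phi_k(r_i) for all electrons i and orbitals k -> (n, n) matrix."""
    d2 = ((r[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-alphas * d2)

def psi(r):
    """Slater determinant antisymmetrizes the orbital matrix."""
    return np.linalg.det(orbitals(r))

r = rng.normal(size=(n, 3))              # electron positions
p1 = psi(r)
p2 = psi(r[[1, 0, 2]])                   # exchange electrons 0 and 1
# Antisymmetry: p2 == -p1, since swapping two rows negates the det.
```

In MōLe the orbital map is a trained network intended to transfer across molecular geometries, rather than a per-system parameterization as above.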

Trends

  • Formal verification meets ML: TorchLean and related work signal a maturing push to give machine-checked guarantees to neural networks, especially in safety-critical scientific applications (PINNs, controllers).

  • Foundation models for quantum chemistry: MACE-POLAR-1 and MōLe both push toward transferable, physically grounded representations that generalize across molecular systems — moving beyond single-system optimization.

  • LLM reasoning for scientific discovery: Multi-agent systems (MAESTRO for catalysis, CiteLLM for literature) show growing use of reasoning-capable LLMs as research accelerators, though formal math benchmarks (PC-FOL, QEDBENCH) reveal persistent structural reasoning gaps.

  • Interpretable scientific ML: The Gaussian particle PDE operator and physics-informed operator splitting both prioritize physical interpretability alongside performance, a countertrend to black-box neural operators.

Notable Papers (6)

1. Learning Physical Operators using Neural Operators

Physics-informed operator splitting decomposes PDEs into separate neural operators for nonlinear and linear components, improving generalization beyond training distributions and enabling flexible temporal discretization.
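The splitting scheme can be sketched concretely. This is a classical Lie splitting for u_t = D u_xx + f(u) with hand-written finite-difference sub-steps as stand-ins; in the paper, each sub-step would instead be a learned neural operator.

```python
import numpy as np

# Lie splitting for u_t = D * u_xx + f(u): advance the linear and
# nonlinear parts separately each time step. (Illustrative stand-in,
# not the paper's model.)
nx, D, dt = 64, 0.1, 1e-3
x = np.linspace(0.0, 1.0, nx, endpoint=False)
u = np.sin(2 * np.pi * x)

def linear_step(u, dt):
    """Diffusion via periodic finite differences (the linear operator)."""
    lap = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) * nx**2
    return u + dt * D * lap

def nonlinear_step(u, dt):
    """Pointwise reaction f(u) = u - u^3 (the nonlinear operator)."""
    return u + dt * (u - u**3)

for _ in range(100):
    u = nonlinear_step(linear_step(u, dt), dt)
```

Splitting is what buys the flexibility noted above: the two operators can be trained independently and composed with different time steps at inference.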

2. MACE-POLAR-1: A Polarisable Electrostatic Foundation Model for Molecular Chemistry

Extends the MACE architecture with explicit long-range electrostatic interactions and charge transfer, creating a foundation model for molecular chemistry that captures effects inaccessible to local-descriptor MLIPs.
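Why an explicit electrostatic term matters can be seen with a back-of-envelope sketch (illustrative only, not MACE-POLAR-1's implementation): a local descriptor with a typical ~5 Å cutoff is blind to the 1/r Coulomb tail, so a long-range energy is added on top of model-predicted partial charges.

```python
import numpy as np

# Two ions 8 A apart -- beyond a typical local-descriptor cutoff.
coords = np.array([[0.0, 0.0, 0.0], [8.0, 0.0, 0.0]])   # Angstrom
charges = np.array([+0.5, -0.5])   # e.g. model-predicted partial charges

def coulomb_energy(coords, charges, ke=14.399645):
    """Pairwise Coulomb sum; ke is the Coulomb constant in eV*A/e^2."""
    i, j = np.triu_indices(len(coords), k=1)
    r = np.linalg.norm(coords[i] - coords[j], axis=1)
    return ke * (charges[i] * charges[j] / r).sum()

e = coulomb_energy(coords, charges)   # ~ -0.45 eV, invisible to a 5 A cutoff
```

An interaction of several tenths of an eV at that distance is chemically significant, which is the gap explicit long-range terms close.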

3. Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving

Introduces PC-FOL, a first-order logic dataset annotated for proof-by-cases and proof-by-contradiction, revealing that LLMs systematically fail non-linear reasoning patterns critical for mathematical proof.
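The non-linear pattern in question is the kind of branching argument shown below, a minimal Lean 4 example of proof by cases (our illustration, not drawn from the PC-FOL dataset): the goal is reached only by discharging each branch of a disjunction separately, rather than by one forward chain of implications.

```lean
-- Proof by cases: from p ∨ q, derive r by handling each branch.
example (p q r : Prop) (h : p ∨ q) (hp : p → r) (hq : q → r) : r := by
  cases h with
  | inl hp' => exact hp hp'
  | inr hq' => exact hq hq'
```

PC-FOL's finding is that LLMs handle the linear `p → q → r` chains far better than this branch-and-merge structure.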

4. QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematics

First large-scale benchmark measuring the alignment gap between LLM-as-Judge and human expert evaluation on university-level math, finding systematic overestimation of LLM performance by automated evaluators.

5. Self-driving thin film laboratory: autonomous epitaxial atomic-layer synthesis via AI

Demonstrates a fully autonomous materials synthesis platform that reduces iterations to optimal stoichiometry by significant margins through AI-driven decision-making in a physical lab setting.

6. Reasoning-Driven Design of Single Atom Catalysts via a Multi-Agent Large Language Model Framework

Multi-agent LLM system (MAESTRO) applies chain-of-thought reasoning to electrocatalyst design, discovering single-atom catalyst candidates beyond conventional ML approaches by leveraging in-context scientific reasoning.

Honorable Mentions

  • Improving Reliability of Machine Learned Interatomic Potentials With Physics-Informed Pretraining
  • Pipeline for Verifying LLM-Generated Mathematical Solutions
  • Benchmarking short-range machine learning potentials for atomistic simulations of metal/electrode interfaces