Table of Contents

Towards Verifiable and Self-Correcting AI Physicists for Quantum Many-Body Simulations

Ken Deng, Xiangfei Wang, Guijing Duan, Chen Mo, Junkun Huang, Runqing Zhang, Ling Qian, Zhiguo Huang, Jize Han, Di Luo

Abstract

Recent advances in automated scientific discovery have shown remarkable promise across frontier research domains, with agent systems driven by large language models (LLMs) emerging as powerful tools for physics research. However, in practical applications, LLM scientific research is prone to hallucinations, highlighting the need for reliable verification and error-correction mechanisms. Here we introduce PhysVEC, an automated multi-agent framework for verifiable and error-correcting AI-driven physics research. PhysVEC incorporates a programming verifier and a scientific verifier to ensure both coding correctness and physical validity, and provides human-auditable evidence at each stage. We curate QMB100, an end-to-end research-level benchmark dataset consisting of $100$ tasks extracted from $21$ high-impact articles focused on quantum many-body physics. We evaluate PhysVEC with four frontier LLMs and find that it significantly outperforms baselines in both programming tests and scientific tests across all LLMs and task categories. PhysVEC demonstrates effective inference-time scaling and delivers accurate physical predictions through integrated verification and error-correction mechanisms, paving the way for reliable and interpretable AI physicists.


Paper Structure

This paper contains 19 sections and 8 figures.

Figures (8)

  • Figure 1: PhysVEC framework and QMB100 dataset. (a) PhysVEC: a multi-agent AI physicist for quantum many-body simulations with self-verification and automated error-correction design. (b) QMB100: a benchmark dataset comprising $100$ figures drawn from $21$ high-impact articles, covering numerical studies that can be implemented with ITensors, NetKet, Qiskit, and ORCA.
  • Figure 2: Verification and error correction via unit tests and integration tests. In the generated script, the Author agent defines a set of element functions that are subsequently called to perform the computations (gray block). The Programming verifier then conducts unit tests (blue block) and integration tests (green block) for all element functions, aggregates the reports, and implements corrections accordingly (red blocks).
  • Figure 3: Results of the programming test. (a) Comparison of the executability of our framework (PhysVEC) with three other baselines (PhysVEC-1-shot, ReAct-RAG, and ReAct). (b), (c) The accuracy in unit tests and integration tests before (hollow markers, PhysVEC-1-shot) and after (solid markers, PhysVEC) the iterative verification and error correction. Integration tests are not applicable to DFT input files.
  • Figure 4: The efficiency of token consumption and tool usage. (a) The $S/T_i$ (input token efficiency), $S/T_o$ (output token efficiency), and $S/T_c$ (tool usage efficiency) of LLMs across different topics. The horizontal dashed line indicates the average performance of the $4$ models within each topic. (b) The marginal utility of tool usage (mainly retrieval from official repositories or manuals) for PhysVEC (ours) versus ReAct-RAG (baseline). The dashed line represents the marginal benefit of using the retrieval tools in ReAct-RAG; solid circles above the dashed line indicate higher tool-use efficiency of PhysVEC. (c) Inference-time scaling via repeated trials on $8$ failed tasks in nqs for Gemini 2.5 Flash. The pass rate, defined as the fraction of tasks that pass at least once within the first $N$ independent trials, increases with repeated trials.
  • Figure 5: Performance in the scientific test on the QMB100 subset. (a) Comparison of the performance of PhysVEC and baselines on the QMB100 subset. The vertical axis (completed tasks) denotes the number of tasks completed successfully after the scientific test. (b) Cumulative number of completed tasks as a function of iteration in the scientific test under the PhysVEC framework.
  • ...and 3 more figures
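The pass-rate metric described in Figure 4(c) is the standard pass@$N$ quantity: the fraction of tasks that succeed at least once within the first $N$ independent trials. A minimal sketch of how it could be computed (variable names are illustrative, not taken from the paper's code):

```python
from typing import List

def pass_rate(trials: List[List[bool]], n: int) -> float:
    """Fraction of tasks that pass at least once within the first n trials.

    trials[t] holds the per-trial outcomes (True = pass) for task t,
    in the order the independent trials were run.
    """
    passed = sum(any(task[:n]) for task in trials)
    return passed / len(trials)

# Hypothetical outcomes for 3 tasks, 4 independent trials each.
outcomes = [
    [False, True, False, False],   # first passes on trial 2
    [False, False, False, False],  # never passes
    [True, True, False, True],     # first passes on trial 1
]
print(pass_rate(outcomes, 1))  # 1 of 3 tasks passes within 1 trial
print(pass_rate(outcomes, 4))  # 2 of 3 tasks pass within 4 trials
```

Because `pass_rate` counts a task as soon as any of its first $N$ trials succeeds, it is non-decreasing in $N$, which is why the curve in Figure 4(c) can only rise with repeated trials.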