PDE-SHARP: PDE Solver Hybrids through Analysis and Refinement Passes
Shaghayegh Fazliani, Madeleine Udell
TL;DR
PDE-SHARP is an end-to-end framework for generating high-accuracy PDE solvers at much lower computational cost by replacing most numerical evaluation with structured LLM reasoning. The method has three stages: Analysis (mathematical reasoning and stability checks), Genesis (solver candidate generation), and Synthesis (LLM-based collaborative selection and hybridization). It supports flexible feedback (numerical residuals, relative errors, or no feedback). Across five PDEBench tasks, PDE-SHARP reduces solver evaluations by 60-75% and improves geometric-mean accuracy by more than a factor of four, with robust performance across multiple LLM families and code-generation strategies. A detailed case study on reaction–diffusion shows that minor, insight-driven refinements can yield large accuracy gains (e.g., a 77× improvement in L2 error), underscoring the value of mathematics-informed synthesis over brute-force sampling. The work also reports improved debugging efficiency and code-quality metrics, suggesting practical benefits for production-grade PDE solver generation and possible extensions to higher-dimensional problems and hybrid neural-numeric approaches.
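As a rough illustration of how a geometric-mean improvement factor across tasks can be computed, the sketch below takes per-task errors for a baseline and an improved solver and returns the geometric mean of their ratios. The error values are made-up placeholders, not results from the paper, and the function names are hypothetical. The text above does not say which error metric enters the average, so the inputs are simply called "errors" here.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed in log space."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def improvement_factor(baseline_errors, new_errors):
    """Geometric mean of per-task error ratios (baseline / new)."""
    if len(baseline_errors) != len(new_errors):
        raise ValueError("both lists must cover the same tasks")
    ratios = [b / n for b, n in zip(baseline_errors, new_errors)]
    return geometric_mean(ratios)


# Placeholder errors for five hypothetical tasks (NOT numbers from the paper).
baseline = [1e-2, 4e-3, 2e-1, 8e-4, 5e-2]
improved = [5e-3, 1e-3, 2.5e-2, 4e-4, 1.25e-2]

factor = improvement_factor(baseline, improved)
print(f"geometric-mean improvement: {factor:.2f}x")
```

The geometric mean is a natural choice when errors span several orders of magnitude across PDEs, because a single task with a large absolute error cannot dominate the average the way it would with an arithmetic mean.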
Abstract
Current LLM-driven approaches that use test-time computing to generate PDE solvers execute a large number of solver samples to identify high-accuracy solvers. These paradigms are especially costly for complex PDEs, whose numerical evaluation requires substantial computational resources. We introduce PDE-SHARP, a framework that reduces computational cost by replacing expensive scientific computation with cheaper LLM inference, achieving superior solver accuracy with 60-75% fewer computational evaluations. PDE-SHARP employs three stages: (1) Analysis: mathematical chain-of-thought analysis including PDE classification, solution type detection, and stability analysis; (2) Genesis: solver generation based on mathematical insights from the previous stage; and (3) Synthesis: collaborative selection-hybridization tournaments in which LLM judges iteratively refine implementations through flexible performance feedback. To generate high-quality solvers, PDE-SHARP requires fewer than 13 solver evaluations on average, compared to 30+ for baseline methods. It improves accuracy uniformly across the tested PDEs, by $4\times$ on average, and demonstrates robust performance across LLM architectures, from general-purpose to specialized reasoning models.
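The three stages can be pictured as a small control loop. The sketch below is only a schematic under stated assumptions, not the authors' implementation: the callables `analyze`, `generate`, `judge_and_hybridize`, and `evaluate` are hypothetical stand-ins for LLM calls and solver execution, and the counts `n_candidates`, `n_finalists`, and `max_rounds` are illustrative defaults rather than values from the paper. It shows the cost-saving idea: many candidates are generated, but only the few chosen by the judges are ever executed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    code: str
    error: Optional[float] = None  # set only if the solver is actually executed


def pde_sharp_sketch(
    pde_description: str,
    analyze: Callable[[str], str],
    generate: Callable[[str, str, int], str],
    judge_and_hybridize: Callable[[List[Candidate], str, int], List[Candidate]],
    evaluate: Callable[[str], float],
    n_candidates: int = 8,   # assumed value
    n_finalists: int = 3,    # assumed value
    max_rounds: int = 2,     # assumed value
) -> Tuple[Candidate, int]:
    # Stage 1 (Analysis): mathematical reasoning about the PDE, no solver runs.
    insights = analyze(pde_description)

    # Stage 2 (Genesis): generate solver candidates informed by the analysis.
    candidates = [
        Candidate(generate(pde_description, insights, i)) for i in range(n_candidates)
    ]

    # Stage 3 (Synthesis): judges select or hybridize; only finalists are executed.
    evaluations = 0
    finalists: List[Candidate] = []
    for _ in range(max_rounds):
        finalists = judge_and_hybridize(candidates, insights, n_finalists)
        for cand in finalists:
            if cand.error is None:
                cand.error = evaluate(cand.code)  # the expensive numerical step
                evaluations += 1
        candidates = finalists  # measured errors feed the next round of judging

    best = min(finalists, key=lambda c: c.error)
    return best, evaluations


# Toy stand-ins so the sketch runs end to end; real versions would call an LLM
# and execute generated solver code.
best, n_evals = pde_sharp_sketch(
    "u_t = nu * u_xx",
    analyze=lambda pde: "parabolic; explicit schemes need a time-step limit",
    generate=lambda pde, insights, i: "x" * (i + 1),
    judge_and_hybridize=lambda cands, insights, k: sorted(
        cands, key=lambda c: len(c.code)
    )[:k],
    evaluate=lambda code: float(len(code)),
)
print(best.code, n_evals)
```

In this toy run eight candidates are generated but only three are executed, which mirrors the trade the abstract describes: extra LLM inference in exchange for fewer numerical evaluations.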
