Explain Like I'm Five: Using LLMs to Improve PDE Surrogate Models with Text

Cooper Lorsung, Amir Barati Farimani

TL;DR

This work introduces a text-conditioned, multimodal PDE surrogate framework that fuses pretrained LLM embeddings with numerical data via a cross-attention block built on FactFormer. By encoding varying levels of system information as text and conditioning the surrogate on these descriptions, the approach improves next-step, autoregressive rollout, and fixed-future predictions across Heat, Burgers, Navier–Stokes, and Shallow-Water benchmarks. Key findings show SentenceTransformer embeddings often yield the best dynamic predictions, while Llama embeddings provide notable gains for end-state accuracy; ablations emphasize coefficient information as a primary driver of performance gains. The study demonstrates a scalable avenue to leverage linguistic system knowledge to enhance PDE surrogates, with future work aimed at PDE-specialized LLM pretraining and extending to other neural operator architectures.

Abstract

Solving Partial Differential Equations (PDEs) is ubiquitous in science and engineering. The computational cost of numerical solvers and the difficulty of writing them have motivated the development of data-driven machine learning techniques to generate solutions quickly. The recent rise in popularity of Large Language Models (LLMs) has made it easy to integrate text into multimodal machine learning models, so additional system information such as boundary conditions and governing equations can be supplied as text. In this work, we explore using pretrained LLMs to integrate various amounts of known system information into PDE learning. Using FactFormer as our testing backbone, we add a multimodal block to fuse numerical and textual information. We compare sentence-level embeddings, word-level embeddings, and a standard tokenizer across 2D Heat, Burgers, Navier-Stokes, and Shallow-Water data sets. These challenging benchmarks show that pretrained LLMs are able to utilize text descriptions of system information and enable accurate prediction using only initial conditions.
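
The fusion step can be illustrated with a minimal NumPy sketch of single-head cross-attention. This is not the paper's implementation: it assumes that queries come from the data embeddings and that keys and values come from the text embeddings, and it adds a residual connection. The function names, the token counts, the embedding widths, and the random projection matrices are all hypothetical placeholders chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(data_tokens, text_tokens, w_q, w_k, w_v):
    """Fuse text embeddings into data embeddings (illustrative sketch).

    data_tokens: (n_data, d)      numerical embeddings, used as queries (assumed)
    text_tokens: (n_text, d_text) text embeddings, used as keys/values (assumed)
    """
    q = data_tokens @ w_q                      # (n_data, d_attn)
    k = text_tokens @ w_k                      # (n_text, d_attn)
    v = text_tokens @ w_v                      # (n_text, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product scores
    weights = softmax(scores, axis=-1)         # each row sums to 1
    fused = data_tokens + weights @ v          # residual connection (assumed)
    return fused, weights


# Hypothetical sizes: 16 data tokens of width 8, 4 text tokens of width 12.
rng = np.random.default_rng(0)
n_data, n_text, d, d_text, d_attn = 16, 4, 8, 12, 8
data_tokens = rng.normal(size=(n_data, d))
text_tokens = rng.normal(size=(n_text, d_text))
w_q = 0.1 * rng.normal(size=(d, d_attn))
w_k = 0.1 * rng.normal(size=(d_text, d_attn))
w_v = 0.1 * rng.normal(size=(d_text, d))

fused, weights = cross_attention(data_tokens, text_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Because the output keeps the shape of the data embeddings, a block of this kind can in principle be placed in front of an existing backbone without changing the backbone's input interface.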

Paper Structure (31 sections, 7 equations, 17 figures, 7 tables)

Figures (17)

  • Figure 1: Multimodal FactFormer adds system information through a multimodal block.
  • Figure 2: Cross-Attention is used to integrate sentence embeddings with data embeddings.
  • Figure 3: Comparison of next-step prediction relative $L^2$ error for baseline FactFormer, FactFormer + ST, FactFormer + Tokenizer, and FactFormer + Llama with and without transfer learning.
  • Figure 4: Comparison of accumulated mean squared error for baseline FactFormer, FactFormer + ST, FactFormer + Tokenizer, and FactFormer + Llama with and without transfer learning.
  • Figure 5: Comparison of fixed-future prediction relative $L^2$ error for baseline FactFormer, FactFormer + ST, FactFormer + Tokenizer, and FactFormer + Llama with and without transfer learning.
  • ...and 12 more figures