F-Adapter: Frequency-Adaptive Parameter-Efficient Fine-Tuning in Scientific Machine Learning

Hangwei Zhang, Chun Kang, Yan Wang, Difan Zou

TL;DR

This work pioneers parameter-efficient fine-tuning for pretrained scientific machine-learning operator models, revealing a depth-driven error floor for LoRA in Fourier-based backbones and showing adapters can circumvent it via universal approximation in the spectral domain. By exploiting the spectral energy concentration of PDE solutions, the authors design Frequency-Adaptive Adapters (F-Adapters) that allocate more capacity to low-frequency bands and less to high-frequency bands. The approach achieves state-of-the-art accuracy on challenging 3D Navier–Stokes benchmarks with only a small fraction of tunable parameters, and theoretical results link spectral decay to improved approximation with adapters. This frequency-aware PEFT paradigm offers a practical, scalable route to fine-tune large SciML models across diverse PDE tasks while preserving spectral fidelity and efficiency.

Abstract

Parameter-efficient fine-tuning (PEFT) of powerful pre-trained models for complex downstream tasks has proven effective in vision and language processing, yet this paradigm remains unexplored in scientific machine learning, where the objective is to model complex physical systems. We conduct the first systematic study of PEFT for pre-trained Large Operator Models (LOMs) obtained by scaling variants of the Fourier Neural Operator. First, we observe that the widely used Low-Rank Adaptation (LoRA) yields markedly poorer performance on LOMs than Adapter tuning. We then establish theoretically that stacked LoRA incurs a depth-amplified lower bound on approximation error within Fourier layers, whereas adapters retain universal approximation capacity and, by concentrating parameters on energy-dominant low-frequency modes, attain error that decays exponentially with bottleneck width in the Fourier domain. Motivated by the robust empirical gains of adapters and by our theoretical characterization of PDE solutions as spectrally sparse, we introduce the Frequency-Adaptive Adapter (F-Adapter). F-Adapter allocates adapter capacity based on spectral complexity, assigning higher-dimension modules to low-frequency components and lower-dimension modules to high-frequency components. Our F-Adapters establish state-of-the-art (SOTA) results on multiple challenging 3D Navier-Stokes benchmarks, markedly enhancing both generalization and spectral fidelity over LoRA and other PEFT techniques commonly used in LLMs. To the best of our knowledge, this work is the first to explore PEFT for scientific machine learning and establishes F-Adapter as an effective paradigm for this domain.
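The frequency-dependent capacity allocation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The 1D setting, the linear width schedule in `band_dims`, the `tanh` bottleneck, the separate treatment of real and imaginary parts, and the zero-initialized up-projection are all assumptions made for illustration, and every name below is hypothetical.

```python
import numpy as np

def band_dims(num_bands, max_dim, min_dim):
    """Bottleneck width per band: widest for band 0 (lowest frequencies)."""
    if num_bands == 1:
        return [max_dim]
    step = (max_dim - min_dim) / (num_bands - 1)  # assumed linear schedule
    return [int(round(max_dim - b * step)) for b in range(num_bands)]

def band_edges(num_modes, num_bands):
    """Split mode indices 0..num_modes-1 into contiguous frequency bands."""
    return np.linspace(0, num_modes, num_bands + 1).astype(int)

class BandAdapter:
    """Residual bottleneck adapter acting on the channel dimension."""
    def __init__(self, channels, width, rng):
        self.down = rng.normal(0.0, 1.0 / np.sqrt(channels), size=(channels, width))
        # Assumption: zero-initialized up-projection, so the adapter starts as identity.
        self.up = np.zeros((width, channels))

    def __call__(self, x):  # x: (modes_in_band, channels), real-valued
        return x + np.tanh(x @ self.down) @ self.up

def f_adapter_forward(u, adapters, edges):
    """Apply one adapter per frequency band to the spectrum of u (n_points, channels)."""
    u_hat = np.fft.rfft(u, axis=0)  # (num_modes, channels), complex
    out = u_hat.copy()
    for adapter, lo, hi in zip(adapters, edges[:-1], edges[1:]):
        band = u_hat[lo:hi]
        # Assumption: real and imaginary parts share the same real-valued adapter.
        out[lo:hi] = adapter(band.real) + 1j * adapter(band.imag)
    return np.fft.irfft(out, n=u.shape[0], axis=0)

rng = np.random.default_rng(0)
channels, n_points, num_bands = 8, 64, 4
dims = band_dims(num_bands, max_dim=16, min_dim=4)   # wide at low, narrow at high frequency
edges = band_edges(n_points // 2 + 1, num_bands)     # rfft yields n_points // 2 + 1 modes
adapters = [BandAdapter(channels, w, rng) for w in dims]
u = rng.normal(size=(n_points, channels))
v = f_adapter_forward(u, adapters, edges)

# Trainable parameters: band-dependent widths versus a uniform width of max_dim.
adaptive_params = sum(2 * channels * w for w in dims)
uniform_params = num_bands * 2 * channels * 16
```

In this toy configuration the band-dependent widths use fewer trainable parameters than giving every band the largest width, and the forward pass returns its input unchanged at initialization because the up-projection starts at zero.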

Paper Structure

This paper contains 73 sections, 10 theorems, 117 equations, 8 figures, 15 tables, 1 algorithm.

Key Result

Proposition 3.1

Let $\Delta W_{\mathrm{g}} = \operatorname{blockdiag}\bigl(\Delta W^{(1)},\dots,\Delta W^{(K)}\bigr)$ be the block-wise model parameter updates and $BA = \operatorname{blockdiag}\bigl(B^{(1)}A^{(1)},\dots,B^{(K)}A^{(K)}\bigr)$ be the block-wise low-rank approximation, where $B^{(k)} \in$ […]. In particular, the worst-case operator-norm error obeys […]
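The statement of Proposition 3.1 is cut off here, so the sketch below does not reproduce the paper's bound. It only checks numerically a standard fact that a bound of this kind would rest on, the Eckart-Young-Mirsky theorem: no rank-$r$ product $B^{(k)}A^{(k)}$ can approximate a block $\Delta W^{(k)}$ in operator norm better than its $(r+1)$-th singular value, and the operator norm of a block-diagonal error is the maximum over blocks. The block count `K`, block size `d`, rank `r`, and the random blocks are illustrative assumptions.

```python
import numpy as np

def best_rank_r(M, r):
    """Truncated SVD: the optimal rank-r approximation in operator norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r], s

def block_diag(mats):
    """Assemble a block-diagonal matrix from a list of 2D arrays."""
    rows = sum(m.shape[0] for m in mats)
    cols = sum(m.shape[1] for m in mats)
    out = np.zeros((rows, cols))
    i = j = 0
    for m in mats:
        out[i:i + m.shape[0], j:j + m.shape[1]] = m
        i += m.shape[0]
        j += m.shape[1]
    return out

rng = np.random.default_rng(0)
K, d, r = 3, 6, 2                       # illustrative sizes, not taken from the paper
blocks = [rng.normal(size=(d, d)) for _ in range(K)]

approx, floors = [], []
for M in blocks:
    M_r, s = best_rank_r(M, r)
    approx.append(M_r)
    floors.append(s[r])                 # sigma_{r+1} (zero-based index r)

# Operator-norm error of the best block-wise rank-r approximation.
global_err = np.linalg.norm(block_diag(blocks) - block_diag(approx), ord=2)

# Arbitrary rank-r factors B A can only do worse than the per-block floor.
random_errors = []
for M in blocks:
    B = rng.normal(size=(d, r))
    A = rng.normal(size=(r, d))
    random_errors.append(np.linalg.norm(M - B @ A, ord=2))
```

Here `global_err` coincides with `max(floors)`, and each entry of `random_errors` is at least the corresponding floor, so the error floor is set by the block whose spectrum decays most slowly.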

Figures (8)

  • Figure 1: Convergence comparison of LoRA and bottleneck Adapter. Adapter not only starts with a lower loss but also reaches a lower steady-state value, indicating faster and more stable convergence.
  • Figure 2: RMSE versus parameter budget (bottleneck dimension $m$ for the Adapter, rank $r$ for truncation) for the two-layer MLP adapter (yellow) and the low-rank truncation baseline (orange). Left: transonic dataset ($M=1.0$). Right: low-Mach dataset ($M=0.1$).
  • Figure 3: Mean $L_2$ error versus the cut-off band index $k$ for Rand $M=1.0$ (upper left), Turb $M=1.0$ (upper right), and Rand $M=0.1$ (bottom).
  • Figure 4: Pipeline for inserting Frequency-Adaptive Adapters (F-Adapters) between consecutive pre-trained Fourier sub-modules in a Fourier layer in LOMs.
  • Figure 5: Side-by-side velocity field comparisons for Turbulence at epoch 500. From left to right: Vanilla Adapter, F-Adapter (Ours), and LoRA. Each panel shows the ground truth alongside the prediction.
  • ...and 3 more figures

Theorems & Definitions (23)

  • Proposition 3.1: Block-wise LoRA lower bound
  • Proposition 3.2: Frequency-selective approximation of adapters
  • Proposition 3.3: Quantitative Low–/High–Frequency Energy Split for PDE Solution
  • Lemma 1: LoRA error lower bound
  • proof
  • Remark 1
  • Remark 2
  • Proposition B.1: Block-wise LoRA lower bound
  • proof
  • Lemma 2: Universal Approximation Theorem for Adapters
  • ...and 13 more