Inverse Neural Operator for ODE Parameter Optimization

Zhi-Song Liu, Wenqing Peng, Helmi Toropainen, Ammar Kheder, Andreas Rupp, Holger Froning, Xiaojie Lin, Michael Boy

Abstract

We propose the Inverse Neural Operator (INO), a two-stage framework for recovering hidden ODE parameters from sparse, partial observations. In Stage 1, a Conditional Fourier Neural Operator (C-FNO) with cross-attention learns a differentiable surrogate that reconstructs full ODE trajectories from arbitrary sparse inputs, suppressing high-frequency artifacts via spectral regularization. In Stage 2, an Amortized Drifting Model (ADM) learns a kernel-weighted velocity field in parameter space, transporting random parameter initializations toward the ground truth without backpropagating through the surrogate, avoiding the Jacobian instabilities that afflict gradient-based inversion in stiff regimes. Experiments on a real-world stiff atmospheric chemistry benchmark (POLLU, 25 parameters) and a synthetic Gene Regulatory Network (GRN, 40 parameters) show that INO outperforms gradient-based and amortized baselines in parameter recovery accuracy while requiring only 0.23s inference time, a 487x speedup over iterative gradient descent.
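
The abstract says Stage 1 suppresses high-frequency artifacts via spectral regularization but does not give the form of the regularizer. As a rough illustration only, the sketch below measures the share of a trajectory's spectral energy above a cutoff mode, a quantity that could be added to a training loss. The function names, the cutoff, and the sinusoidal test signals are assumptions made for this example, not details from the paper.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)); fine for a short illustration."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def high_frequency_penalty(trajectory, cutoff):
    """Fraction of spectral energy in modes whose frequency index exceeds `cutoff`.

    Hypothetical regularizer: the paper's actual spectral term may differ.
    """
    n = len(trajectory)
    energy = [abs(c) ** 2 for c in dft(trajectory)]
    # For a real signal, modes k and n - k are conjugates, so the
    # physical frequency index of mode k is min(k, n - k).
    high = sum(e for k, e in enumerate(energy) if min(k, n - k) > cutoff)
    total = sum(energy)
    return high / total if total > 0 else 0.0


n = 32
# A smooth "trajectory" (mode 2 only) and the same signal with a mode-12 artifact.
smooth = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]
noisy = [s + 0.3 * math.sin(2 * math.pi * 12 * i / n) for i, s in enumerate(smooth)]

penalty_smooth = high_frequency_penalty(smooth, cutoff=4)
penalty_noisy = high_frequency_penalty(noisy, cutoff=4)
```

Here the smooth signal carries essentially no energy above the cutoff, while the contaminated one is penalized in proportion to the artifact's energy share, which is the behavior a spectral regularizer of this kind is meant to encourage.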

Paper Structure (11 sections, 10 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: ODE parameter optimization via the proposed Inverse Neural Operator (INO). INO recovers hidden ODE parameters from sparse partial observations across two benchmarks. (a) POLLU (chemical kinetics): 25 unknown reaction rate coefficients governing 20 chemical species. Parameters evolve from random initialization (light orange) toward the ground truth (dark orange), while the predicted trajectory simultaneously converges to the true ODE solution. (b) GRN (gene regulatory network): 40 active regulatory coefficients within a $20\times20$ interaction matrix. Residual heatmaps show that both the recovered parameters and the predicted gene expression trajectories converge to near-zero error after optimization.
  • Figure 2: Overall architecture of the proposed Inverse Neural Operator (INO). INO decouples forward surrogate learning from inverse parameter recovery across two stages. Stage 1 (CNO): a Conditional FNO with affine parameter modulation and Cross-Attention reconstructs the full ODE trajectory from sparse partial observations. Stage 2 (ADM): the frozen CNO acts as a forward evaluator only; pairwise residuals drive a kernel-weighted drifting field that transports random parameter initializations toward the ground truth without backpropagating through the surrogate.
  • Figure 3: Overall architecture of the proposed components. Left: the Conditional Neural Operator (CNO), consisting of conditional FNO (C-FNO) blocks and a Cross-Attention block producing the full ODE solution given ODE parameters and partial observations. Right: the Amortized Drifting Model (ADM), consisting of conditional MLP blocks (C-MLP) that learn a kernel-weighted drifting velocity field in parameter space, supervised without backpropagation through the surrogate.
  • Figure 4: Visual comparison of ODE optimization using state-of-the-art methods. Black dots are normalized true ODE parameters. Different colors visualize the mean and variance of different methods starting from different initializations.
  • Figure 5: Visualization of ODE parameter optimization on the GRN dataset. Left: MSE losses on ODE fitting and parameter updating over iterations. Middle: distribution of 40 parameters after optimization. Right: residual map between ground truth and final prediction.
  • ...and 1 more figures
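
The Figure 2 caption describes Stage 2 as a kernel-weighted drifting field that moves random parameter initializations toward the ground truth using forward evaluations only. The sketch below illustrates that idea on a toy problem, under several assumptions that are not taken from the paper: the frozen surrogate is replaced by the closed-form solution of dy/dt = -theta * y, the kernel is a product of a Gaussian locality term and an exponential fitness term, and the field is computed directly for a population of candidates rather than learned by a network as the ADM does. All names and bandwidth values are hypothetical.

```python
import math

TRUE_THETA = 1.3  # hypothetical ground-truth decay rate for the toy problem
TIMES = [0.25 * k for k in range(1, 9)]  # sparse observation times


def forward(theta):
    """Toy stand-in for a frozen surrogate: y(t) = exp(-theta * t)."""
    return [math.exp(-theta * t) for t in TIMES]


OBSERVED = forward(TRUE_THETA)


def residual(theta):
    """Mean squared mismatch between a forward evaluation and the observations."""
    pred = forward(theta)
    return sum((p - o) ** 2 for p, o in zip(pred, OBSERVED)) / len(TIMES)


def drift_velocities(thetas, fit_bandwidth=1e-3, length_scale=1.0):
    """Kernel-weighted velocity for each candidate (assumed kernel form).

    Candidate i is pulled toward candidate j with a weight that is large when
    j is nearby in parameter space and has a small residual. Only forward
    evaluations are used; no gradient of `forward` is computed.
    """
    res = [residual(th) for th in thetas]
    best = min(res)
    velocities = []
    for th_i in thetas:
        num, den = 0.0, 0.0
        for th_j, r_j in zip(thetas, res):
            locality = math.exp(-((th_i - th_j) ** 2) / (2 * length_scale ** 2))
            fitness = math.exp(-(r_j - best) / fit_bandwidth)
            w = locality * fitness
            num += w * (th_j - th_i)
            den += w
        velocities.append(num / den)
    return velocities


def transport(thetas, steps=40, step_size=0.5):
    """Repeatedly move every candidate a fraction of the way along its velocity."""
    for _ in range(steps):
        v = drift_velocities(thetas)
        thetas = [th + step_size * vi for th, vi in zip(thetas, v)]
    return thetas


initial = [0.1 * (i + 1) for i in range(30)]  # candidates spread over (0, 3]
final = transport(initial)
```

Because each update is a convex combination of existing candidates, this toy version can only reach values inside the initial spread, and it settles near the best-fitting region rather than exactly on the true value. A learned, amortized field as described in the caption would replace the explicit double loop with a single network evaluation.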