
Autonomous Adaptive Solver Selection for Chemistry Integration via Reinforcement Learning

Eloghosa Ikponmwoba, Opeoluwa Owoyele

Abstract

The computational cost of stiff chemical kinetics remains a dominant bottleneck in reacting-flow simulation, yet hybrid integration strategies are typically driven by hand-tuned heuristics or supervised predictors that make myopic decisions from instantaneous local state. We introduce a constrained reinforcement learning (RL) framework that autonomously selects between an implicit BDF integrator (CVODE) and a quasi-steady-state (QSS) solver during chemistry integration. Solver selection is cast as a Markov decision process. The agent learns trajectory-aware policies that account for how present solver choices influence downstream error accumulation, while minimizing computational cost under a user-prescribed accuracy tolerance enforced through a Lagrangian reward with online multiplier adaptation. Across sampled 0D homogeneous reactor conditions, the RL-adaptive policy achieves a mean speedup of approximately $3\times$, with speedups ranging from $1.11\times$ to $10.58\times$, while maintaining accurate ignition delays and species profiles for a 106-species \textit{n}-dodecane mechanism and adding approximately $1\%$ inference overhead. Without retraining, the 0D-trained policy transfers to 1D counterflow diffusion flames over strain rates $10$--$2000~\mathrm{s}^{-1}$, delivering consistent $\approx 2.2\times$ speedup relative to CVODE while preserving near-reference temperature accuracy and selecting CVODE at only $12$--$15\%$ of space-time points. Overall, the results demonstrate the potential of the proposed reinforcement learning framework to learn problem-specific integration strategies while respecting accuracy constraints, thereby opening a pathway toward adaptive, self-optimizing workflows for multiphysics systems with spatially heterogeneous stiffness.
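To make the constrained objective concrete, the following is a minimal sketch of a Lagrangian-shaped reward with online multiplier adaptation, assuming a per-step wall-clock cost and a scalar error metric checked against the user-prescribed tolerance; the names (`LagrangianReward`, `step_cost`, `error_metric`, `lam_lr`) are illustrative and not taken from the paper.

```python
class LagrangianReward:
    """Sketch: minimize solver cost subject to an accuracy constraint,
    enforced through a Lagrange multiplier updated by projected dual ascent."""

    def __init__(self, tolerance: float, lam_init: float = 1.0, lam_lr: float = 0.01):
        self.tolerance = tolerance  # user-prescribed accuracy tolerance
        self.lam = lam_init         # Lagrange multiplier (dual variable)
        self.lam_lr = lam_lr        # dual-ascent step size

    def reward(self, step_cost: float, error_metric: float) -> float:
        # Penalize only the positive part of the constraint violation.
        violation = max(error_metric - self.tolerance, 0.0)
        return -step_cost - self.lam * violation

    def update_multiplier(self, mean_error: float) -> None:
        # Online adaptation: lambda grows while the mean error exceeds the
        # tolerance and decays (remaining nonnegative) once it is satisfied.
        self.lam = max(self.lam + self.lam_lr * (mean_error - self.tolerance), 0.0)
```

Calling `update_multiplier` with, say, an episode-mean error lets the penalty weight adapt online, steering the policy toward the fastest solver mix that still respects the prescribed tolerance.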

Figures (5)

  • Figure 1: Reinforcement learning (RL) framework for adaptive solver selection. The RL agent observes the evolving thermo-chemical state variables (e.g., $T$, OH, H$_2$, O, etc.) from the combustion simulation and selects the optimal numerical solver (e.g., CVODE or QSS) at each step; a minimal sketch of this selection loop is given after the figure list.
  • Figure 2: RL-adaptive performance across initial conditions: (a) speedup distribution and (b) dependence on $(T_0,p)$. Accuracy is maintained within $2.6\%$ ignition-delay error and $<110~\mathrm{K}$ temperature RMSE.
  • Figure 3: 0D trajectory comparisons across thermochemical conditions using the RL-adaptive solver policy. (a) Condition 1 ($650~\mathrm{K}$, $1.0~\mathrm{atm}$) achieves a 3.25$\times$ speedup with 0.41% ignition-delay error and 5.4% CVODE usage. The temperature subplot shows solver selection via colored scatter (red = CVODE, green = QSS), with CVODE concentrated in a narrow ignition window ($t \approx 15$--$22$ ms). (b--d) As temperature and pressure (and hence stiffness) increase, CVODE usage rises from 14% (b) to 68.0% (c,d), demonstrating adaptive resource allocation. In all cases, the RL-adaptive approach maintains ignition-delay errors below 2.62% and temperature RMSE below $108~\mathrm{K}$.
  • Figure 4: Space-time evolution of temperature, H$_2$O, and OH, illustrating agreement between the RL-adaptive solution and the CVODE reference, and the deviations introduced by pure QSS.
  • Figure 5: Spatiotemporal solver deployment in the 1D counterflow diffusion flame at strain rate $2000~\mathrm{s}^{-1}$, as selected by the RL policy. Red denotes CVODE and turquoise denotes QSS.
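As a complement to the Figure 1 caption, a minimal sketch of the per-step agent-environment loop it describes is given below; `policy`, `cvode_step`, and `qss_step` are hypothetical interfaces standing in for the trained policy and the two integrators, not code from the paper.

```python
def integrate_chemistry(state, dt, n_steps, policy, cvode_step, qss_step):
    """Sketch: advance the thermochemical state for n_steps, letting the
    learned policy pick the integrator (0 = CVODE, 1 = QSS) at each step."""
    actions = []
    for _ in range(n_steps):
        action = policy(state)             # observe T, species, ...; choose solver
        if action == 0:
            state = cvode_step(state, dt)  # implicit BDF step (stiff regions)
        else:
            state = qss_step(state, dt)    # cheaper quasi-steady-state step
        actions.append(action)
    return state, actions
```

Logging the selected actions in this way reproduces solver-deployment maps of the kind shown in Figure 5.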