Staying Alive: Online Neural Network Maintenance and Systemic Drift

Joshua E. Hammond, Tyler Soderstrom, Brian A. Korgel, Michael Baldea

TL;DR

This work tackles online maintenance of neural networks trained on physical dynamical systems subject to slow parameter drift. It introduces the Subset Extended Kalman Filter (SEKF), which updates only a gradient-selected subset of parameters, guided by loss sensitivity and two quantile-based selection schemes, enabling real-time adaptation with reduced computational cost compared to retraining or full EKF. Across four dynamic regression case studies (one-dimensional drift, CSTR drift, diabetic insulin sensitivity drift, and FCC process changes), SEKF maintains or improves prediction accuracy while delivering substantially lower mean time per iteration and fewer hyperparameter tuning requirements. The results demonstrate the practical potential of online, data-efficient model maintenance for control and optimization tasks under drift, with broader implications for continual adaptation of large neural models.

Abstract

We present the Subset Extended Kalman Filter (SEKF) as a method to update previously trained model weights online rather than retraining or finetuning them when the system a model represents drifts away from the conditions under which it was trained. We identify the parameters to be updated using the gradient of the loss function and use the SEKF to update only these parameters. We compare finetuning and SEKF for online model maintenance in the presence of systemic drift through four dynamic regression case studies and find that the SEKF maintains model accuracy as well as, if not better than, finetuning while requiring significantly less time per iteration and less hyperparameter tuning.
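The idea described in the abstract (select parameters by loss gradient, then apply a Kalman update to that subset only) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar model output, a single quantile threshold on the gradient magnitude as the selection rule, and a standard EKF parameter update restricted to the selected block. The names `sekf_step`, `linear_predict`, `linear_jacobian` and the values of `q`, `R`, and `Q` are hypothetical and chosen for illustration only.

```python
import numpy as np

def sekf_step(theta, P, x, y, predict, jacobian, q=0.5, R=1e-2, Q=1e-6):
    """One subset-EKF-style update for a scalar-output model (illustrative sketch).

    theta: parameter vector (n,); P: parameter covariance (n, n).
    q: quantile threshold on |dLoss/dtheta| (assumed selection rule).
    R, Q: measurement and process noise variances (assumed values).
    """
    theta, P = theta.copy(), P.copy()
    H = jacobian(theta, x)                # d(prediction)/d(theta), shape (n,)
    e = y - predict(theta, x)             # innovation (prediction error)
    grad = -e * H                         # gradient of the loss 0.5 * e**2
    mag = np.abs(grad)
    # Keep only parameters whose gradient magnitude reaches the q-quantile.
    idx = np.flatnonzero(mag >= np.quantile(mag, q))
    H_s = H[idx]
    P_s = P[np.ix_(idx, idx)]
    S = H_s @ P_s @ H_s + R               # innovation variance (scalar)
    K = P_s @ H_s / S                     # Kalman gain for the subset
    theta[idx] += K * e                   # update selected parameters only
    P[np.ix_(idx, idx)] = P_s - np.outer(K, H_s @ P_s) + Q * np.eye(idx.size)
    return theta, P, idx

# Toy stand-in for a trained network: a model that is linear in its parameters.
def linear_predict(theta, x):
    return float(theta @ x)

def linear_jacobian(theta, x):
    return x

x = np.array([1.0, 2.0, 3.0, 4.0])
theta0 = np.zeros(4)
theta1, P1, idx = sekf_step(theta0, np.eye(4), x, 10.0,
                            linear_predict, linear_jacobian)
print(idx, theta1)
```

In this toy case only the two parameters with the largest gradient magnitude are changed, and the covariance update touches a 2x2 block rather than the full 4x4 matrix. That reduced block size is the intuition behind the lower per-iteration cost the abstract reports for larger models.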

Paper Structure

This paper contains 26 sections, 28 equations, 8 figures, 13 tables.

Figures (8)

  • Figure 1: Simulation of the system in Example \ref{ex:simpleexample}. (a) Solution of the system \ref{eq:example1} in blue, simulation of the layer equation \ref{eq:goveq_layereq} in orange, and results of the iterative scheme \ref{eq:goveq_layereq_iterative} in dashed lines. (b) A neural network approximation of the system, where the neural network is iteratively retrained to reflect recent dynamics.
  • Figure 2: (a) The system dynamics, shown in grey, drift away from the original dynamics, as shown by the disparity between the true concentrations and the prediction of the original model in dashed blue. (b) NODE architecture used.
  • Figure 3: Results for the One-Dimensional System with Parameter Drift. (a) The lowest-error result of each maintenance method for the test data. (b) Summarized results with 95% confidence bounds for each maintenance method operating on all parameters, and the best result on a subset of parameters.
  • Figure 4: (a) Summary statistics for the CSTR with Reaction Rate Constant Drift. (b) Summary statistics for the Simulated Type-II Diabetic Patient with Insulin Sensitivity Drift. In both cases, the SEKF outperforms finetuning and can update a subset of model parameters with little or no loss in performance. For the larger model of the diabetic patient, operating on a subset of model parameters significantly decreases the time required to perform model maintenance.
  • Figure 5: (a)-(e) Mean nMSE and MTPI, with 95% confidence intervals, for the Fluid Catalytic Cracker Unit system under process change.
  • ...and 3 more figures

Theorems & Definitions (2)

  • Example 1
  • Example 2