Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences

Dimitris Bertsimas; Vassilis Digalakis; Yu Ma; Phevos Paschalidis

TL;DR

This work tackles unstable ML retraining by proposing Slowly Varying ML Sequences (SVML) that explicitly balance predictive power and structural stability across data batches. It grounds the approach in a model-agnostic framework with a Pareto-optimality guarantee, supported by a tractable MIO-based algorithm and a polynomial-time restricted formulation that reduces to a shortest-path problem. The authors demonstrate theoretically and empirically that modest sacrifices in accuracy yield substantial gains in stability and interpretability, validated across regression, decision trees, boosted trees, and neural networks in healthcare, vision, and language settings, including a hospital deployment. The practical impact lies in enabling trustworthy, deployable retraining pipelines that preserve analytical insights while maintaining competitive predictive performance. Future work aims to handle anticipated distribution shifts and incorporate explicit cost considerations into the stability framework.
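The shortest-path reduction mentioned above can be illustrated with a minimal sketch. It assumes a finite pool of candidate models per batch and a weighted-sum objective (total loss plus `lam` times total distance between consecutive models). The names `slowly_varying_path`, `losses`, `distance`, and `lam` are hypothetical, and the paper's restricted formulation may differ in its details (for example, it may use a constraint rather than a penalty).

```python
from typing import Callable, List, Sequence, Tuple


def slowly_varying_path(
    losses: Sequence[Sequence[float]],
    distance: Callable[[int, int, int], float],
    lam: float,
) -> Tuple[float, List[int]]:
    """Pick one candidate per batch minimizing total loss + lam * total distance.

    losses[b][k]     : hypothetical loss of candidate k on batch b
    distance(b, j, k): hypothetical distance between candidate j at batch b
                       and candidate k at batch b + 1
    This is a shortest path in a layered graph, solved by dynamic programming.
    """
    best = list(losses[0])  # best[k]: cheapest path cost ending at candidate k
    back: List[List[int]] = []  # back-pointers, one list per transition
    for b in range(1, len(losses)):
        new_best, pointers = [], []
        for k, loss_k in enumerate(losses[b]):
            costs = [best[j] + lam * distance(b - 1, j, k) for j in range(len(best))]
            j_star = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[j_star] + loss_k)
            pointers.append(j_star)
        best = new_best
        back.append(pointers)

    # Recover the optimal sequence by following back-pointers from the end.
    k = min(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return total, path


# Toy data: two candidates per batch; switching candidates costs distance 1.
toy_losses = [[1.0, 2.0], [2.0, 1.0], [1.0, 2.0]]
toy_distance = lambda b, j, k: 0.0 if j == k else 1.0

greedy_total, greedy_path = slowly_varying_path(toy_losses, toy_distance, lam=0.0)
stable_total, stable_path = slowly_varying_path(toy_losses, toy_distance, lam=5.0)
```

With `lam=0.0` the sketch reproduces greedy per-batch selection, which switches models at every batch. With a larger `lam` it keeps a single candidate throughout at a slightly higher total loss. The running time is quadratic in the number of candidates per batch and linear in the number of batches.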

Abstract

We consider the problem of retraining machine learning (ML) models when new batches of data become available. Existing approaches greedily optimize for predictive power independently at each batch, without considering the stability of the model's structure or analytical insights across retraining iterations. We propose a model-agnostic framework for finding sequences of models that are stable across retraining iterations. We develop a mixed-integer optimization formulation that is guaranteed to recover Pareto-optimal models (in terms of the trade-off between predictive power and stability) with good generalization properties, as well as an efficient polynomial-time algorithm that performs well in practice. We focus on retaining consistent analytical insights (which is important for model interpretability, ease of implementation, and fostering trust with users) by using custom-defined distance metrics that can be directly incorporated into the optimization problem. We evaluate our framework across models (regression, decision trees, boosted trees, and neural networks) and application domains (healthcare, vision, and language), including deployment in a production pipeline at a major US hospital. We find that, on average, a 2% reduction in predictive power leads to a 30% improvement in stability.

Paper Structure

This paper contains 36 sections, 6 theorems, 8 equations, 11 figures, 4 tables.

Introduction
Contributions
Related Work
A Framework for ML Model Retraining
Problem Setting
General Formulation
Pareto Optimality of Solutions
Pareto Excess Risk Bound
A Practical Algorithm for Model Retraining
Tractable Restricted Formulation
Adaptive Model Retraining
Measuring the Stability Loss
Structural Distances
Analytical Insight-based Distances
Comparing Distance Types
...and 21 more sections

Key Result

Theorem 2.4

A sequence $\boldsymbol{f^*}$ obtained by solving Problem \ref{eqn:formulation-constrained} is weakly Pareto optimal (WPO).
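
For reference, the standard multi-objective notion of weak Pareto optimality (WPO) can be written as below. The symbols $\mathcal{F}$ (feasible set of model sequences), $L$ (aggregate predictive loss), and $D$ (aggregate instability) are assumptions introduced for illustration. The paper's own Definition 2.3 may use different notation.

$$
\boldsymbol{f^*} \in \mathcal{F} \text{ is WPO} \iff \nexists\, \boldsymbol{f} \in \mathcal{F} \ \text{such that} \ L(\boldsymbol{f}) < L(\boldsymbol{f^*}) \ \text{and} \ D(\boldsymbol{f}) < D(\boldsymbol{f^*}).
$$

In words, no feasible sequence is strictly better on both objectives at once.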

Figures (11)

Figure 1: Pareto frontier (in-sample and out-of-sample) between predictive power (aggregated across batches) and stability, obtained by solving Problem \ref{eqn:formulation-constrained} (full) or Problem \ref{eqn:formulation-restricted} (restricted, for varying numbers of candidate models).
Figure 2: High-probability excess risk bound for linear regression across batches. The dotted line represents the independent bound \ref{eq:independent-bound}, while the solid line represents the Pareto-optimal bound \ref{eq:pareto-bound}. Left: $B = 5$; Right: $B = 10$.
Figure 3: Empirical validation of excess risk bounds.
Figure 4: Decision boundaries (top) and top feature importances (bottom) for logistic regression models.
Figure 5: Decision boundaries (top) and top feature importances (bottom) for decision tree models.
...and 6 more figures

Theorems & Definitions (15)

Remark 2.2
Definition 2.3
Theorem 2.4
Theorem 2.5
Theorem 2.6
Lemma 2.7
Theorem 3.1
Definition 3.2
Definition 3.3
Theorem 3.4
...and 5 more
