When to retrain a machine learning model
Regol Florence, Schwinn Leo, Sprague Kyle, Coates Mark, Markovich Thomas
TL;DR
This work tackles the practical problem of when to retrain machine learning models in the presence of continuous data drift, balancing retraining costs against future performance. It introduces UPF, an uncertainty-based retraining framework that forecasts future model performance using a Beta-distributed performance model, approximated by a Gaussian for tractable learning, and makes cost-aware decisions via a quantile-based rule. The approach rests on a principled objective that combines retraining costs and horizon-long performance, with theoretical bounds guiding retraining frequency and leveraging simple, data-efficient predictors. Empirical results on seven datasets show UPF consistently outperforms shift-detection baselines and CARA-like methods, even under mispecified retraining costs, highlighting its practical robustness and applicability in real-world, low-data settings.
Abstract
A significant challenge in maintaining real-world machine learning models is responding to the continuous and unpredictable evolution of data. Most practitioners are faced with the difficult question: when should I retrain or update my machine learning model? This seemingly straightforward problem is particularly challenging for three reasons: 1) decisions must be made based on very limited information - we usually have access to only a few examples, 2) the nature, extent, and impact of the distribution shift are unknown, and 3) it involves specifying a cost ratio between retraining and poor performance, which can be hard to characterize. Existing works address certain aspects of this problem, but none offer a comprehensive solution. Distribution shift detection falls short as it cannot account for the cost trade-off; the scarcity of the data, paired with its unusual structure, makes it a poor fit for existing offline reinforcement learning methods, and the online learning formulation overlooks key practical considerations. To address this, we present a principled formulation of the retraining problem and propose an uncertainty-based method that makes decisions by continually forecasting the evolution of model performance evaluated with a bounded metric. Our experiments addressing classification tasks show that the method consistently outperforms existing baselines on 7 datasets.
