Prospective Learning: Learning for a Dynamic Future
Ashwin De Silva, Rahul Ramesh, Rubing Yang, Siyu Yu, Joshua T Vogelstein, Pratik Chaudhari
TL;DR
Prospective Learning (PL) reframes learning for dynamic environments in which data distributions and objectives evolve over time. Rather than a single fixed hypothesis, the learner returns a sequence of predictors for future times and is evaluated with a time-augmented loss. The paper defines Prospective Risk and Prospective Bayes Risk, introduces Prospective ERM as a strong learner under consistency and concentration conditions, and provides theoretical guarantees for convergence to the Bayes limit in time-varying settings. Empirical validation on synthetic data, MNIST, and CIFAR-10 shows that Prospective ERM can track changing tasks and reduce prospective risk, whereas standard ERM and many online continual-learning baselines do not. The work also explores discounted losses and periodic/Markovian dynamics, discusses connections to related paradigms, and lays groundwork for scalable, time-aware learning in nonstationary real-world systems.
Abstract
In real-world applications, the distribution of the data, and our goals, evolve over time. The prevailing theoretical framework for studying machine learning, namely probably approximately correct (PAC) learning, largely ignores time. As a consequence, existing strategies to address the dynamic nature of data and goals exhibit poor real-world performance. This paper develops a theoretical framework called "Prospective Learning" that is tailored for situations when the optimal hypothesis changes over time. In PAC learning, empirical risk minimization (ERM) is known to be consistent. We develop a learner called Prospective ERM, which returns a sequence of predictors that make predictions on future data. We prove that the risk of Prospective ERM converges to the Bayes risk under certain assumptions on the stochastic process generating the data. Prospective ERM, roughly speaking, incorporates time as an input in addition to the data. We show that standard ERM as done in PAC learning, without incorporating time, can result in failure to learn when distributions are dynamic. Numerical experiments illustrate that Prospective ERM can learn synthetic problems and visual recognition problems constructed from MNIST and CIFAR-10. Code is available at https://github.com/neurodata/prolearn.
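The contrast between standard ERM and a time-aware learner can be illustrated with a toy sketch. It is not the paper's experimental setup or the code in the linked repository. It assumes a hypothetical binary task whose labelling rule flips every `PERIOD` steps, and it assumes the learner knows that period, so that time can be reduced to a phase feature. All names (`PERIOD`, `phase`, `sample`, `fit_majority`, `future_error`) are illustrative. Both learners minimize empirical 0-1 error over lookup tables; only the second one sees time.

```python
import random
from collections import Counter, defaultdict

PERIOD = 10  # assumed (hypothetical) number of steps between task switches


def phase(t):
    """Which of the two alternating tasks is active at time t."""
    return (t // PERIOD) % 2


def sample(t, rng):
    """Draw (x, y) at time t: y = x in phase 0, y = 1 - x in phase 1."""
    x = rng.randint(0, 1)
    y = x if phase(t) == 0 else 1 - x
    return x, y


def fit_majority(pairs):
    """ERM over lookup tables under 0-1 loss: majority label per feature."""
    votes = defaultdict(Counter)
    for feat, y in pairs:
        votes[feat][y] += 1
    return {feat: c.most_common(1)[0][0] for feat, c in votes.items()}


def future_error(predict, t_start, t_end, rng):
    """Average 0-1 error on fresh samples drawn at future times."""
    errors = 0
    for t in range(t_start, t_end):
        x, y = sample(t, rng)
        errors += predict(x, t) != y
    return errors / (t_end - t_start)


rng = random.Random(0)
T = 200  # length of the observed history
history = [(t, *sample(t, rng)) for t in range(T)]

# Time-agnostic ERM: the hypothesis sees only x.
plain = fit_majority([(x, y) for t, x, y in history])
# Time-aware ERM: the hypothesis sees x together with a function of t.
timed = fit_majority([((x, phase(t)), y) for t, x, y in history])

plain_err = future_error(lambda x, t: plain[x], T, T + 200, rng)
timed_err = future_error(lambda x, t: timed[(x, phase(t))], T, T + 200, rng)

print(f"time-agnostic ERM future error: {plain_err:.2f}")
print(f"time-aware ERM future error:    {timed_err:.2f}")
```

In this toy process, any predictor that depends on `x` alone is correct in one phase and wrong in the other, so its future error stays near one half no matter how much history it sees. The time-aware table recovers the labelling rule in both phases. Supplying the phase directly is a simplification made here for brevity; a learner that does not know the period would need to learn a suitable representation of time from the data.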
